The AI Agentic Workforce I - Foundations
Conceptualizing the AI workforce in technology teams
AI thinkpieces are unfortunately ubiquitous, fairly overwritten, and rarely insightful.
This is often made worse by the fact that many such pieces are AI “assisted” to the point that any potential for original thought is lost under the dross of the LLM parrot, rendering all text in its monochromatic style.
I can make no promises that this piece will be novel, but its goal is to offer a tentative vision for how the generic buzzword “agentic workforce” can be translated into a specific, actionable framework.
I plan to make this a series of posts.
In this first post we will unpack some foundational preliminaries and in subsequent posts I will work iteratively through the implementation of an AI agentic workforce for two case studies: AI Agile SDLC (the developer function), and AI ITSM (the DevOps function).
Both use cases will be informed by my personal experience operating within both functions.
The key will be to identify what AI is most instrumental for, where it is weakest, and varying approaches for human-in-the-loop management of the AI workforce in each of these scenarios.
I may round this out with a fourth post covering general observations, important cautions around AI agents, lessons learned, and speculation on future trends. We will see.
What is the Goal?
Technology is the practical application of scientific knowledge to achieve a particular set of goals. This is a notion that goes straight back to the Greeks.
Technology is by this definition oriented as a means towards an exterior end.
If you could achieve your end without the means, there would be no need for a technology to fill the gap, so to speak.
Consequently, the fewer means an end requires, the better.
If a corporation’s officers are charged with a fiduciary duty for the benefit of shareholders, and the benefit for such stakeholders is profit, then it behooves the corporation’s officers to optimize technology to fulfill these fiduciary goals.
Within the strict confines of this logic, any opportunity to optimize for generative AI and agentic workflows in comparison with the human workforce is the collective obligation of corporate executives.
Discussions and anxieties around AI automation and workforce replacement cannot forget that within this single-minded framework, concerns for human welfare or perpetuating the economy through stable consumer income simply are not material considerations.
It is of course a contentious debate how other sets of concerns can moderate or mediate the strict, narrow drive of the technologist, but these are questions we will not dive into here.
The long and short of it is that companies are built to make money, and there is considerable interest in identifying the avenues by which AI agents may be used to increase profits.
If the technologist’s goal is to build systems that build systems in the sense of Don Knuth, then it goes without saying that we should aim to see the extent to which current technology workflows can be abstracted and transferred to AI agents.
To put it most concretely the question is if and how we can transfer the traditional duties and responsibilities of developer, engineer, manager, and other roles into an AI-centric system-of-systems.
The Tools
Let us consider a specific suite of tools at our disposal to accomplish this goal.
For our case study, we will focus on Cursor as the tool of choice.
The AI Agent
Briefly, let us recall how an AI agent works by starting with the LLM.
Large language models (LLMs) are for the most part transformer neural networks.
During run-time, an LLM takes in text as a sequence of tokens, where a token is a word or even a chunk of a word. It then performs inference to provide the user with output.
LLM inference consists of probabilistic sampling from a distribution the model learned from its training data (e.g. a snapshot of the entire Internet). Given all of the input provided, the LLM assigns a probability to each possible next token, one token is sampled, and the process repeats for the token after that. It proceeds one token at a time until it has completed a response, which is returned to the user.
There are many configuration elements at play here, but it must be emphasized that (1) LLM output is functionally indeterminate by virtue of its probabilistic model, and (2) LLMs are stateless.
There is no conversation, memory, or context at the LLM layer.
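The two properties above can be illustrated with a toy sketch. Everything here is hypothetical: `toy_model` is a made-up stand-in for a real forward pass, and its four-word vocabulary and scores are invented for illustration. The only point is that generation is repeated sampling from a probability distribution, and that the model function keeps nothing between calls.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over tokens."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def toy_model(prompt_tokens):
    """Hypothetical stand-in for an LLM forward pass.

    It is a pure function of its input: nothing is remembered between calls.
    """
    vocab = ["the", "cat", "sat", "<end>"]
    # Invented scores; only the last one depends on how long the input is.
    logits = [1.0, 2.0, 0.5, 0.1 * len(prompt_tokens)]
    return vocab, logits

def generate(prompt_tokens, max_new_tokens=5, temperature=1.0, rng=None):
    """Sample one token at a time, feeding the growing sequence back in."""
    rng = rng or random.Random()
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        vocab, logits = toy_model(tokens)
        probs = softmax(logits, temperature)
        next_token = rng.choices(vocab, weights=probs, k=1)[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[len(prompt_tokens):]
```

Calling `generate(["hello"])` twice without a fixed seed will generally give different continuations, which is the indeterminacy in question. Any appearance of memory has to come from the caller passing earlier text back in as input.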
The AI agent is a higher level of abstraction which includes scaffolding to inject or provide metadata to the LLM to shape its output.
Such context can be layered:
At the model level by the model provider (GPT 5.5, Claude Opus 4.8),
At the platform level by the hosting provider (e.g. chatgpt.com),
At the company/team/user/workspace level through user-defined context,
At the conversation level through context managed by the platform or tool,
Through tools or particular agent commands injected into a conversation.
This is how we arrive at distinct AI agents that can be custom-tailored through context and metadata to provide reliably specialized outputs depending on their inputs.
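As a rough sketch of that scaffolding, assume an agent is little more than a wrapper that concatenates its context layers and the running conversation into one prompt for a stateless model. The `Agent` class, the bracketed layer labels, and the `llm` callable are all hypothetical names chosen for illustration, not any vendor's API.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Hypothetical scaffold that supplies context to a stateless model."""
    layers: list  # ordered (label, text) pairs, broadest layer first
    history: list = field(default_factory=list)  # (role, text) pairs

    def build_prompt(self, user_message):
        # Every call re-sends all layers plus the whole conversation so far.
        parts = [f"[{label}]\n{text}" for label, text in self.layers]
        parts += [f"[{role}]\n{text}" for role, text in self.history]
        parts.append(f"[user]\n{user_message}")
        return "\n\n".join(parts)

    def ask(self, user_message, llm):
        # llm is any callable mapping a prompt string to a reply string.
        reply = llm(self.build_prompt(user_message))
        self.history.append(("user", user_message))
        self.history.append(("assistant", reply))
        return reply
```

In this sketch the conversation lives entirely in `Agent.history`. Two agents sharing the same model but holding different `layers` would behave as distinct specialists.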
Context Management Tooling
In particular, Cursor offers several types of context management that can be injected at different layers.
As of mid 2026 these include:
Rules: Persistent instructions to constrain the agent at a directory or workspace level. These can be configured to be mandatory (always-apply) or to allow agent discretion (agent-decided). These are rendered into conversation context regardless of the prompt.
Skills: Reusable task guidelines for specific, opinionated workflows such as working with specific platforms, tools, or frameworks. This can include general patterns, references, or granular syntax guidelines.
Commands: Reusable prompts to perform specific actions. Best for repeatable actions that can be reliably abstracted into a specific prompt either for context initialization or concrete SOPs.
Hooks: Deterministic automations that have the greatest interruptive power during AI actions to block or guide AI actions. These can run before or after AI lifecycle events to implement stricter influence than rules, skills, or commands which can functionally be neglected by the AI agent especially as context accumulates.
These can be applied or propagated at an enterprise, team, or user level. The primary advantage of enterprise or team level Cursor context abstractions is that they introduce greater regularity, determinacy, and stronger context management.
This can better standardize outcomes worked out by individual contributors whose prompts may not carry the appropriate technical context or precision necessary to avoid the “one-shot” solution approach often provided by LLMs missing proper context.
They can serve both as enablement and guardrails to protect the results of those working with AI agents.
However, there is no strict inheritance logic in weighing user-level Cursor rules or hooks vs. enterprise-level rules or hooks,[1] so it would be fairly easy for the individual contributor or team to attempt to bypass enterprise governance controls set via Cursor.
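To make the contrast between hooks and the other tools concrete, here is a minimal sketch of what a deterministic guard looks like in principle. It is not Cursor's hook API: the function name `before_shell_command`, its return shape, and the blocked-program list are all assumptions made for illustration. The point is only that the decision is made by ordinary code rather than by text the model may or may not heed.

```python
import shlex

# Illustrative policy only; a real policy would be far more thorough.
BLOCKED_PROGRAMS = {"rm", "shutdown"}

def before_shell_command(command):
    """Hypothetical hook: decide whether a proposed shell command may run.

    Returns (allowed, reason). This naive check inspects only the first
    word, so it would miss wrappers such as sudo, pipes, and subshells.
    """
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse failures are rejected outright.
        return False, "could not parse command"
    if not argv:
        return False, "empty command"
    if argv[0] in BLOCKED_PROGRAMS:
        return False, f"'{argv[0]}' is blocked by policy"
    return True, "ok"
```

A rule that says "never delete build artifacts" is a request that can be lost as context accumulates. A check like this one returns the same answer for the same command every time.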
Optimizing Context Management
Context Overload
These Cursor native context management tools have limitations.
From a user perspective, the most apparent limitation is the frequent tendency of AI agents to simply overlook or ignore these context controls that are passed in to the LLM.
This is particularly apparent for nuanced tasks such as working in technology where dozens or hundreds of considerations or constraints must be applied in the execution of a particular task.
The largest contributing factor is context overload.
When you are passing in a large number of Cursor rules, skills, or hooks, compounded by what is enforced at the team or enterprise level and by the system prompts Cursor itself injects, the context becomes bloated. In turn, the LLM functionally treats most of it as noise when performing its probabilistic calculation of the next token.
In this way, AI “thinking” is not unlike how humans deal with too many details. Noise is filtered out.
Unfortunately, at this time AI agents do not have sophisticated judgment in what is popularly called the “signal-to-noise” ratio. From a human perspective, the probabilistic output can have truly bizarre prioritization of conflicting details or considerations.
It remains to be seen whether model or platform-context tuning can truly enable the LLM to reach the capacity for judgment expected of human professionals. It simply may not be possible with LLMs.[2] (It is also complicated by the fact that human professionals can have formal accountability structures for their decision making in ways the LLM cannot.)
Such speculation aside, the key recommendation to improve AI agentic outcomes here is to reduce the amount of context that is injected into agentic conversations.
Shorter and fewer Cursor rules, hooks, skills.
This introduces an economic factor into context management.
The law of diminishing returns entails a judgment of how to most efficiently optimize and manage agentic AI context before performance degrades to an intolerable degree.
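One way to reason about that trade-off is as a budget. The sketch below assumes each piece of context carries a hand-assigned priority and that token cost can be approximated at roughly four characters per token; both are simplifying assumptions of mine, and real tokenizers will give different counts.

```python
def estimate_tokens(text):
    """Crude heuristic (about four characters per token); real tokenizers differ."""
    return max(1, len(text) // 4)

def select_context(items, budget):
    """Greedily keep the highest-priority items that fit within the budget.

    items is a list of (priority, text) pairs, where a higher priority wins.
    Returns (kept_texts, tokens_used).
    """
    kept, used = [], 0
    for priority, text in sorted(items, key=lambda item: item[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(text)
            used += cost
    return kept, used
```

Assigning priorities is the hard part, and this sketch does not solve it. It only makes explicit that every rule, skill, or hook competes for the same finite space.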
There are two solutions I have found effective to mitigate this: (1) context scoping and (2) tool-augmented context retrieval.
Context Glob Scoping
Context scoping is built into how Cursor allows you to define these context management tools by scoping them to particular directories, tasks, platform use cases, etc.
Taking this one step further, you can model your local workspace directory structure around your AI agentic workflows. Document proposals should be separated from discovery assessments, which should be separated from Git repositories.
It is worth conducting a meta-workflow assessment with an AI agent that instructs it to review the latest Cursor best practices and then feeds it your personal contextual specifics on day-to-day tasking to build out a local glob structure that can then be effectively mapped into how you scope your Cursor rules, skills, etc.
Your directory architecture itself should be baked into a Cursor rule that is prioritized for each agentic workflow.
This both trims redundancy within the context you maintain and, at the very least, alleviates context bloat by loading only the context notes relevant to the task or purpose of the workflow you are engaged in.
The AI agent will only receive the context it needs, trimming excess bloat and reducing the probability of hallucination.
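The selection logic behind glob scoping can be sketched in a few lines. The rule names, directory layout, and rule texts below are hypothetical, and the matching uses Python's `fnmatch` module, whose `*` also matches path separators. That makes it looser than the glob semantics a given tool may apply, so treat it as an illustration of the idea rather than a reproduction of Cursor's behavior.

```python
from fnmatch import fnmatchcase

# Hypothetical rules for a workspace split into proposals/, discovery/, repos/.
RULES = [
    {"name": "workspace-layout", "globs": ["*"],
     "text": "Proposals live in proposals/, assessments in discovery/, code in repos/."},
    {"name": "proposal-style", "globs": ["proposals/*.md"],
     "text": "Write proposals in plain, direct prose."},
    {"name": "python-conventions", "globs": ["repos/*.py"],
     "text": "Add type hints to public functions."},
]

def rules_for(path, rules=RULES):
    """Return the names of rules whose globs match the file being worked on."""
    return [
        rule["name"]
        for rule in rules
        if any(fnmatchcase(path, pattern) for pattern in rule["globs"])
    ]
```

Under these assumptions, a session working on `proposals/q3-plan.md` would load the layout rule and the proposal style rule, while the Python conventions stay out of context entirely.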
Tool-Augmented Context Retrieval
Tool-augmented context retrieval is a general term that could be likened to memory management through address pointers.
Computer science is, in one sense, the economics of mathematics.
Computers apply formal abstractions under finite constraints, so data, memory, etc. must be represented and managed economically.
One solution to this problem is the use of references or pointers. Rather than duplicating a large object, it can be more efficient to store a small address or handle that points to where the data lives.
This same principle can be transposed to Cursor rules, skills, and hooks, which can be defined as indexes that merely point to data sources external to the AI context prompts, whether these be local static documents, static content available external to the workstation, or dynamic data and live platforms exposed through Model Context Protocol (MCP) servers.
We will not delve into the specifics of MCPs here.
The point is that Cursor’s context tools can be built into an addressing system for broad swathes of information that can inform agentic workflows whether this be:
Frameworks and standards
Company policies
Documentation and style guides
Internal or external databases
Integrations with cloud service providers, observability platforms, etc.
This builds in an extra layer of abstraction for the AI agent to be able to peruse broader swathes of information without “hard-coding” all possible data referents in a static set of rules and hooks.
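The pointer analogy can be sketched as an index whose entries hold a short description and a loader, with the full content fetched only on demand. The `ContextIndex` class and `file_loader` helper are hypothetical names. In practice a loader could just as well wrap a call to an MCP server or a database query, which is not shown here.

```python
from pathlib import Path

class ContextIndex:
    """Hypothetical index: small entries that point at larger sources."""

    def __init__(self):
        self._entries = {}  # topic -> (description, loader)

    def register(self, topic, description, loader):
        # loader is a zero-argument callable that returns the full text.
        self._entries[topic] = (description, loader)

    def summary(self):
        """The only text that needs to be injected into context up front."""
        return "\n".join(
            f"- {topic}: {description}"
            for topic, (description, _) in sorted(self._entries.items())
        )

    def fetch(self, topic):
        """Dereference the pointer: load the content only when it is needed."""
        _, loader = self._entries[topic]
        return loader()

def file_loader(path):
    """Build a loader for a local static document."""
    return lambda: Path(path).read_text(encoding="utf-8")
```

The design choice is the same one that motivates pointers. The summary costs a line per source no matter how large the source is, and the cost of the full content is paid only for the topics a task actually touches.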
It is a useful paradigm shift toward thinking in terms of the “systems building systems” model, and at least a step closer toward making the unwieldy, chaotic LLM a finite state machine of sorts.
Conclusion
These are foundations around context management, along with related preliminary considerations, that are worth ironing out before proceeding into a systematic analysis of constructing an AI agentic workforce.
In the next post, I will consider the hypothetical scenario of a tech product SaaS startup and how the various functions of the SDLC process can be abstracted into an agentic framework, and how such a framework would scale if we are following the traditional SaaS startup route (and perhaps where the agentic workforce would necessarily diverge).
From there, I will apply the same approach to IT Service Management (DevOps, infrastructure, platform etc.).
[1] I am not able to find public documentation on this point, but this information was shared with me by Cursor representatives during a workshop session.
[2] There is no shortage of ML experts who remain adamant AGI cannot be reached with the LLM. Most seem to point to predictive ML as the prize racehorse here, even if it has fallen out of favor with the advent of the LLM.
