TL;DR
Setting up an agent workflow feels like coding with unnecessary complexity bolted on. If your process amounts to full-on prompt engineering, it's probably not worth it.
Background
My workplace doesn’t have AI mandates or licenses. We can’t use Copilot or paste company code into ChatGPT. I mainly use ChatGPT as an enhanced search engine and for generating AWS templates.
Because of FOMO, I decided to test a complete agent workflow on a personal project. The main goal was figuring out whether it would actually make me faster or more productive.
The Project
- Swift iOS app with SwiftUI (never did mobile dev before)
- Python backend using Flask/FastAPI
- CI/CD with GitHub Actions, Docker, bash scripts
- Deployed on DigitalOcean server
Agent Setup
- Cursor IDE (normally use basic text editor)
- ChatGPT Plus subscription + API integration
- Budget focused approach
My Workflow Structure
I organized everything into 4 main folders:
- `templates/` - reusable prompts like `auth-login-handler.md`
- `examples/` - sample functions, tests, and validation patterns in my coding style
- `schemas/` - API contracts, data models, business rules
- `history/` - tracking log of agent modifications
ChatGPT suggested this structure after some discussion.
What Worked Well
- Cursor can reference documentation directly. You can ask “based on @framework-docs, what does this function return?”
- Generates massive amounts of code super fast
- Works decently when you accept 80-90% quality output
- Excellent for reviewing unfamiliar code (Swift was new to me)
- Great at answering “does this follow best practices per @language-docs”
- Helpful for translation like “convert this Python logic to Swift”
What Didn’t Work
- Creating detailed schemas and prompts means you’ve already done the hard thinking
- System design, architecture, API structure still requires human brain power
- Huge review overhead since I didn’t write the generated code
- A human still needs to write either the tests or the code; you can't hand the agent both
- Ignores style guides and does whatever it wants anyway
- Regenerates entire files instead of targeted changes, breaking working code
Key Realization
Writing code isn’t actually my bottleneck. This became super obvious when drowning in tons of generated code that wasn’t particularly useful.
Better Use Cases
- Library/language questions
- Specific isolated tasks like “create CloudFormation lambda function”
- Code review for unfamiliar languages
- System design feedback
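The "specific isolated tasks" case worked because the output is a small, self-contained artifact that's easy to eyeball. As an illustration of the kind of thing I mean by "create CloudFormation lambda function", here's a minimal sketch; the resource names and inline handler are placeholders of mine, not output from my project:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  # Placeholder execution role; a real template would scope this down
  HelloFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { Service: lambda.amazonaws.com }
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
  HelloFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt HelloFunctionRole.Arn
      Code:
        ZipFile: |
          def handler(event, context):
              return {"statusCode": 200, "body": "ok"}
```

Tasks at this granularity are where the agent actually saved me time, because verifying the result is cheap.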
Example Template
Using conventions from `templates/code_style.md`
Following patterns in:
- `examples/validation_schema.py` for validation setup
- `examples/api_endpoint.py` for route structure
Build a Flask endpoint for login at `/authenticate`
### Specs:
**Validation:**
- Use validation schema matching `schemas/login_contract.json`
**Endpoint:**
- `/authenticate` POST only
- Validate request with schema
- Success: use `db.commit()`, return 200 with token
- User not found: raise `AuthenticationError`, return 401
- Add `@swagger` decorator for docs
- No session commit on validation failures
### Style Requirements:
- Match reference file patterns exactly
- Keep code clean per style guide
### Logging:
Add entry to `history/changes.md` with today's date summarizing additions
Schema Example
```json
{
  "title": "LoginCredentials",
  "type": "object",
  "properties": {
    "email": {
      "type": "string",
      "format": "email"
    },
    "password": {
      "type": "string",
      "minLength": 6
    }
  },
  "required": ["email", "password"],
  "additionalProperties": false
}
```
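To make the schema concrete: here's a stdlib-only Python sketch of the checks it implies, since the prompt leaves the validation library up to the agent. The `validate_credentials` helper, its error messages, and the rough email regex are my assumptions, not generated code or part of my project:

```python
import re

# Rough email check standing in for JSON Schema's "format": "email";
# real validators are considerably more permissive/precise.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_credentials(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    # "required": ["email", "password"]
    for field in ("email", "password"):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    # "additionalProperties": false
    for field in payload:
        if field not in ("email", "password"):
            errors.append(f"unexpected field: {field}")
    email = payload.get("email")
    if email is not None and not isinstance(email, str):
        errors.append("email: must be a string")
    elif isinstance(email, str) and not EMAIL_RE.match(email):
        errors.append("email: invalid format")
    password = payload.get("password")
    if password is not None and not isinstance(password, str):
        errors.append("password: must be a string")
    elif isinstance(password, str) and len(password) < 6:
        errors.append("password: shorter than minLength 6")
    return errors
```

In practice I'd point the agent at a real JSON Schema validator rather than hand-rolling checks like this; the point is that the schema file pins down exactly this behavior.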
Honestly, the whole thing felt more frustrating than helpful. When prompts need pseudo-code levels of detail and the output still needs extensive review, it's extra work rather than a productivity boost.