My First Two Weeks as an Agentic Engineer

by Moisés Carvalho on May 19, 2026 (15 min read)

There’s a version of this post where I tell you AI made me 10x more productive, everything clicked on the first try, and I’m now shipping software while sipping coffee with my feet up.

This is not that post.

A few weeks ago, Codeminer42 challenged me to build a side project from scratch using only AI tools. No writing code by hand. Every single line had to come from prompts. My job was to describe what I wanted, read what came back, and fix problems with more prompts. The company provided me with a Claude Code subscription for this, and the rule was simple: if I wanted something to exist in the codebase, I had to ask for it.

I ended up building two projects instead of one: a full Rails 8 web app with MIDI keyboard integration, and a standalone Wear OS app for walking and jogging interval training, with haptic timers and over 150 automated tests. I published both projects. Each one taught me things I couldn’t have learned any other way.

If you’re not familiar with what Agentic Engineering means, Edy Silva wrote a great introduction on this blog that’s worth reading first.

The setup

My stack was simple: the Claude Code CLI as the main agent, plus MCP servers (Chrome DevTools and Google Stitch for the first project) to give Claude access to the browser and design tools. I used VS Code for the Rails project and Android Studio for Wear OS, and each project had a CLAUDE.md file that I kept updating as I went.

One thing worth mentioning about how I work with Claude Code: I prefer to confirm every action before it happens. When Claude wants to edit a file, create a folder, or run a command, I review and approve it first. It slows things down a little, but I always know what’s happening in the codebase and I catch problems before they pile up.

Codeminer42 also provided me with the Anthropic API key I needed for both projects. One of them actually calls that API from inside the app itself, but I’ll get to that.

Oh, and I hit session limits a few times during the two weeks, especially on longer days. Honestly, I get why people have been complaining about that lately. It’s genuinely annoying when you’re in the middle of something and Claude just stops. Thankfully, Codeminer42 expanded my limits when I needed them, which saved me more than once, though it’s still something you have to plan around.

Project #1: Piano Learner

What it is

A Rails 8 web application for learning piano, with real-time MIDI keyboard integration, AI-generated song analysis powered by the Anthropic API, staff notation via VexFlow, a BPM-driven practice engine, and a scoring system. Almost 300 automated tests across RSpec and Vitest. Five days.

The song library lists every track available to practice

Teaching Claude to See

Building the song list page was one of the first real tests of the workflow. I asked Claude to build it, it built it, ran the tests, fixed a few things, and everything passed. Great, right?

Not quite. When I opened the page in the browser, the elements weren’t showing up the way I expected. The tests were green, the code looked fine, but the actual visual result was wrong. That’s when I remembered that Miguel Marcondes, a fellow developer at Codeminer42, had mentioned a Chrome MCP that lets you connect the browser directly to Claude Code. I figured this was a good moment to actually try it.

So I installed it. And Claude opened Chrome inside WSL, which was not at all what I wanted. I needed it to connect to my existing Windows Chrome instance through a remote debugging session, not spin up a new browser inside the Linux environment.

So I started a new Claude Code session and asked it to help me figure out how to change the MCP configuration to do what I actually needed. It showed me a few config files, but none of them worked. Next, I asked it to check the Chrome MCP GitHub repository for the correct flags to connect to a browser through remote debugging, and also to look at the Claude Code docs to find where the actual MCP configuration files live. That combination worked. Claude found both pieces, I updated the right file with the right flags, and suddenly I had a live Chrome session connected to Claude Code.
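As a rough sketch of the idea, and not the exact file from this project, the small Ruby script below prints a project-scoped MCP configuration that points the Chrome DevTools MCP server at an already-running browser. Several things here are assumptions to verify against the Chrome MCP repository and the Claude Code docs: the `chrome-devtools-mcp` package name, its `--browserUrl` flag, the `.mcp.json` file name, and the host and port. It also assumes Chrome on Windows was started with `--remote-debugging-port=9222` and that this port is reachable from inside WSL, which depends on your WSL networking setup.

```ruby
require "json"

# Builds the hash that would be written to a project-level .mcp.json.
# Assumption: the MCP server accepts --browserUrl to attach to an existing
# Chrome instance instead of launching a new one.
def chrome_mcp_config(browser_url: "http://127.0.0.1:9222")
  {
    "mcpServers" => {
      # "chrome-devtools" is just a label; any server name works here.
      "chrome-devtools" => {
        "command" => "npx",
        "args" => ["-y", "chrome-devtools-mcp@latest", "--browserUrl=#{browser_url}"]
      }
    }
  }
end

# Print the JSON so it can be reviewed before saving it as .mcp.json.
puts JSON.pretty_generate(chrome_mcp_config)
```

If the debugging port is exposed on a different address, pass it explicitly, for example `chrome_mcp_config(browser_url: "http://localhost:9333")`.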

From that point on, I could send screenshots directly through the MCP and ask Claude to look at what was visually wrong. That changed how I worked with the frontend for the rest of the project.

Claude calling Claude

On Day 3, I added a feature where the app calls Claude’s API to generate study guides for each song. So Claude Code, the tool I was using to build the project, was writing the code that calls Claude as a feature inside the project. The AI was programming how to talk to itself.

Honestly, that wasn’t in the original plan; it just came from what the project needed. The guides it produced were teacher-like and direct: practice drills with suggested tempos, milestone-based tips, emotional descriptions of each chord. I rewrote the prompt once because the first version wasn’t quite right. After that, it worked well.
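To make the idea concrete, here is a minimal sketch of what a call like that can look like in plain Ruby, using only the standard library and the Anthropic Messages API. It is an illustration under assumptions, not the code from the app: the method names, the song fields (`title`, `key`, `bpm`), and the prompt wording are all hypothetical, and the model ID is left as a parameter on purpose.

```ruby
require "json"
require "net/http"
require "uri"

ANTHROPIC_URL = URI("https://api.anthropic.com/v1/messages")

# Builds the request body for a study-guide request.
# The prompt text and song fields are hypothetical, not the app's real schema.
def study_guide_payload(song, model:, max_tokens: 1024)
  prompt = <<~PROMPT
    You are a piano teacher. Write a short study guide for "#{song[:title]}"
    (key: #{song[:key]}, target tempo: #{song[:bpm]} BPM).
    Include practice drills with suggested tempos and milestone-based tips.
  PROMPT

  {
    model: model,
    max_tokens: max_tokens,
    messages: [{ role: "user", content: prompt.strip }]
  }
end

# Sends the payload to the Messages API and returns the text of the
# first content block in the response.
def generate_study_guide(song, api_key:, model:)
  request = Net::HTTP::Post.new(ANTHROPIC_URL)
  request["x-api-key"] = api_key
  request["anthropic-version"] = "2023-06-01"
  request["content-type"] = "application/json"
  request.body = JSON.generate(study_guide_payload(song, model: model))

  response = Net::HTTP.start(ANTHROPIC_URL.host, ANTHROPIC_URL.port, use_ssl: true) do |http|
    http.request(request)
  end
  raise "Anthropic API error: #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body).dig("content", 0, "text")
end
```

A call would then look like `generate_study_guide({ title: "Ode to Joy", key: "C major", bpm: 90 }, api_key: ENV.fetch("ANTHROPIC_API_KEY"), model: ENV.fetch("ANTHROPIC_MODEL"))`, where both environment variable names are placeholders. Keeping the payload builder separate from the HTTP call makes the prompt easy to test and easy to rewrite without touching the network code.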

The analysis page breaks a song into sections with suggested tempos and tips

Switching models mid-project

I used two Claude models during this project. Sonnet handled the earlier work: scaffolding, models, controllers, getting the basic structure running. Opus took over when things got more complex, like the BPM timing engine, the scoring algorithm, the VexFlow integration, and improving test quality. Sonnet was faster and great at building things out quickly. Opus was slower but made noticeably better decisions when the problem was harder. Knowing when to switch mattered more than I expected.

Google Stitch, at first

My best friend had shown me Google Stitch a few weeks before the challenge. She was excited about it and said it was impressive. I was a bit skeptical: from what I had seen, it was still making mistakes, especially around typography.

But during the challenge, I decided to give it a proper try. First, I wrote some prompts describing the design I wanted, sent the project’s instruction file so it would have context, and then asked Stitch to generate something. Unfortunately, the results weren’t good enough. It created features that didn’t exist in the project, inconsistencies started appearing, and eventually, after a few more requests, it just stopped responding altogether.

So I settled for exporting what I had managed to get as images and using those as a visual reference for Claude locally. Nothing fancy, just screenshots to point at.

A design system, built by accident

Then, while looking through the export options, I noticed one of them said “MCP Server”. I was already familiar with the Chrome MCP setup by then, so I gave it a try, hoping this one would be less painful to configure. It connected to Claude without much trouble.

The problem was what happened next: instead of reading my existing design and working from it, Claude used the connection to start creating a completely new design from scratch. So I stopped it before it went too far and told it to ask me questions before touching anything. Surprisingly, that actually helped. It walked me through a few questions about the direction I wanted, offered some suggestions I hadn’t considered, and eventually, from that conversation, we built a proper design system together.

I was listening to the Cyberpunk 2077 anime soundtrack during all of this, and its mood ended up shaping the whole visual direction more than any prompt I wrote. The neon colors, the dark backgrounds, the glowing accents: I hadn’t planned any of it. It just came from what was playing. Sometimes the best design decisions aren’t decisions at all.

The result was the interface you can see below.

Practice mode shows the upcoming notes on the staff in real time

The scoring screen appears right after you finish a song

The CLAUDE.md confession

One of my bigger mistakes on this project: I only started creating the CLAUDE.md file on Day 5. By then I had already spent four days re-explaining the project structure, conventions, and rules at the start of every session, because Claude has no memory between sessions.

Once I finally wrote it (around 350 lines covering architecture, routes, test commands, and a growing list of anti-patterns), new sessions picked up where the last one left off. The anti-patterns section was especially valuable. Every time I corrected a mistake Claude had made, I added it to the file so it wouldn’t happen again:

“Avoid pending or empty test specs”
“Don’t write smoke-test-only request specs”
“Always test AI status query methods”

I should have started this on Day 1. That’s probably the single most useful thing I can tell anyone who is just getting started with Claude Code.

The feature we built and deleted (fun fact)

On Day 2, Claude built a complete file import pipeline: over 600 lines of working code with models, importers, and background jobs for handling MIDI and ABC music files. By the end of the same day, though, I realized we didn’t need the feature, so we deleted all of it.

In a normal project, throwing away 600 lines of working code would feel like a real loss. Here it was an easy call. That change in how I thought about waste was one of the subtler shifts of these two weeks.

By the numbers

Calendar days	5
Total commits	55
Lines added	~16,800
Ruby code	~4,600 lines
JavaScript code	~2,000 lines
Total tests	~288 (RSpec + Vitest)
Database migrations	13