Recent posts
17 September 2025
Do humans review my ChatGPT agent content?
Your ChatGPT agent content (including screenshots) may be accessed by a limited number of authorized OpenAI personnel, as well as trusted service providers that are subject to confidentiality and security obligations in order to: (1) investigate abuse or a security incident; (2) provide support to you if you reach out to us with questions about your account; (3) handle legal matters; or (4) improve model performance (unless you have opted out). Access to content is subject to technical access controls and limited only to authorized personnel on a need-to-know basis. Additionally, we monitor and log all access to user content and authorized personnel must undergo security and privacy training prior to accessing any user content.
From a GDPR perspective, this is vitally important to be aware of.
09 September 2025
During an agentic coding experiment using OpenCode with Grok Code Fast 1, I was met with an interesting response in the form of a reasoning loop. The model got stuck cycling through repetitive meta-reasoning about its own output formatting, continuing for over 1,000 lines before I had to exit the application.
The reasoning loop seemingly exposed some internal system prompts, but most interestingly, they appear to come from a different model than the one OpenCode purported to be using. Here are the system prompt fragments that became visible during the loop:
- But guidelines say avoid unnecessary preamble.
- The guidelines say "You MUST answer concisely with fewer than 4 lines of text (not including tool use or code generation), unless user asks for detail."
- The guidelines say "One word answers are best." but this is not a simple question.
- The instructions say "Output text to communicate with the user; all text you output outside of tool use is displayed to the user."
The loop itself looked something like this, repeating endlessly:
Yes.
To box it, but in the format, it's not boxed, it's just the text.
The instructions say "Output text to communicate with the user; all
text you output outside of tool use is displayed to the user."
So, the final text is the response.
Yes.
The answer is the response.
Yes.
To box it, but in the format, it's not boxed, it's just the text.
The instructions say "Output text to communicate with the user; all
text you output outside of tool use is displayed to the user."
So, the final text is the response.
Yes.
The answer is the response.
Yes.
After some searching online, I found that these prompts resemble reported Claude-style instructions, though not ones that Anthropic has published in its release notes. The interesting part is that I was using Grok Code Fast 1 inside OpenCode, so it’s unclear why Claude-style prompts were showing up at all.
08 September 2025
Borrowing another idea from Simon Willison’s blog, I’ve added a tools page where I’ll be posting experimental HTML applications, mostly the result of vibe coding with LLMs. I like this as a way of capturing what these models are capable of at a given point in time, and I’m interested to see how the complexity of the apps evolves as I get more comfortable with this style of ‘coding’. I’m not trying to take humans out of the loop (and would never advocate that), but asking an LLM to generate a complete working app from just one or a handful of prompts, even if the result is small, feels like a useful benchmark. It also gives generative AI that ‘sci-fi’ feel that makes it so fascinating to use.
03 September 2025
As a novel way to try out new model releases, Simon Willison is known for asking them to produce an SVG of a pelican riding a bicycle. So for this blog, I thought I’d try my own benchmark, combining my love of Brighton and sport:
Generate an SVG of a seagull playing basketball
Interestingly, both Llama and Gemini initially refused to provide an SVG, saying they are text-based and can’t create one. I asked them if they could produce code — which they confirmed they could — and then questioned how that was any different from generating an SVG. After that, they produced the SVG markup.
gpt-5
Claude 4 Sonnet
Grok 4 Heavy
Llama 4 Maverick
Gemini 2.5 Flash
02 September 2025
With this blog being focused on AI, it felt fitting to use its creation as a chance to test the current capabilities of LLMs. I decided to use the recently released gpt-5 model, starting by asking for the pros and cons of four blog options:
- Use an existing platform such as Hashnode or Medium
- Build a simple HTML document
- Write in Markdown and publish to GitHub Pages
- Build a full platform with a backend
I added some scope: I didn’t want to spend more than a few days building it, and I liked the idea of the blog’s creation itself being part of the AI experiment. The recommendation? Option 3 — exactly what I would have chosen myself.
I knew you could publish to GitHub Pages using Markdown, but I hadn’t realised it required Jekyll, a tool I’d never used before. This turned into the perfect use case: to build the entire blog without once looking at the Jekyll documentation.
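For anyone curious how little is actually involved, here’s a minimal sketch of a Jekyll site on GitHub Pages (the title and theme values are just placeholders, not necessarily what this blog uses):

```yaml
# _config.yml — the only configuration file a basic Jekyll site needs
title: My Blog
theme: minima   # the default theme GitHub Pages applies if none is set
```

Posts then go in a `_posts/` folder as Markdown files named with a date prefix (e.g. `_posts/2025-09-02-first-post.md`), each starting with a short YAML front-matter block such as `layout: post` and `title:`. GitHub Pages runs Jekyll automatically on every push, so there’s no local build step required.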
29 August 2025
Just setting up my Blog.