🧚🏼‍♀️ Belinda Mo

Experiments

3 min read

🤸🏻‍♀️ 🤸🏻‍♀️ 🤸🏻‍♀️

🧪 Experiment write-ups

generative learning sequence
21-24. Test effects of qi gong on people and objects
20. AI interface to generate flashcards about Wikipedia’s vital articles (version 1)
19. Recreate SWE-bench to collect real-world programming evaluation data
18. bmo.cafe for automating tasks
17. Try Hume.ai - an empathetic AI
16. Learn how OpenAI does public evals
15. Learn Morse code with chunking and spaced repetition
14. Build a simple scalar-based neural network
13. Memorize 111 experiments using Avatar: The Last Airbender

Larger umbrella of experiments:

Quantified learning

See the Experiment System Changelog if you’re interested in the meta of how I’m running these experiments!

Older experiments

These write-ups were created by running my written notes through ChatGPT.

12. Claude vs. GPT on GSM8K math benchmarks - How do they compare? 🟢
11. What does a useful parking ticket payer agent look like? 🔴
10. GPT vs. Gemini - What does a minimalistic user interface for model comparison look like? 🟢
9. How might a set of AI models improve code generation through iterative code and test set generation? 🟢
8. What does a good translation interface look like for same-language “translation” across contexts? 🔴
7. How is the writing quality of GPT-4 + RAG for querying over my Obsidian notes? 🔴
6. What is it like to convert a Next.js app to PWA? 🟢
5. What does a suite of personal GPTs on ChatGPT look like for content like research proposals and social media posts? 🟠
4. What does it look like to automate UI improvements with GPT-Vision? 🟢
3. How accurate is RAG + GPT-4 for an orthopedic surgeon? 🟢
2. What would a knowledge graph of science look like, generated recursively entirely by GPT-4? 🟢
1. How do I track my hours in a way that feels good? 🟠
0. Will I be able to build this site in 1 day, where I’ll stick with it? 🔴

Proposals

These are proposals that I’ve been working on.

Back-translation for Alignment of LLM Generation of Code, Tests, and more
Proposal - Crowdsourcing a universal browser automation task graph with continuously running tests 🧪
AI Theorem Proving for International Math Olympiad Problems
“LLM Operating System”
Explainable Protocols for 2x+ Retention Improvements to Human Long-Term Memory
What does a better knowledge graph structure for enhancing human learning alongside LLMs look like?
What does a configurable format for a single AI agent’s profile look like?
What does a configurable format look like to represent essential data for a single person?

Created with Quartz v4.1.0, © 2025
