Hire June Kim
I build research systems that can be inspected, rerun, and judged by their artifacts: agent pipelines, evaluation harnesses, review protocols, and tools that turn AI research questions into working software.
I am an AI systems / research engineer focused on agentic workflows, LLM quality evaluation, provenance, memory, review loops, and developer tools. I have 10+ years of software engineering experience across Google, Loom, and startups.
The common thread is anti-credence, pro-merit: replace trust in credentials, affiliation, and polished claims with artifacts that earn trust through evidence, provenance, review, execution, and use.
The best fit is a team that needs someone between research and product, someone who can build the system, run it against real workflows, measure what breaks, and turn the result into a better product or protocol.
Forward me for
- Research engineering for AI systems, agents, coding tools, or evaluations.
- Applied AI work where ambiguous research ideas need to become shipped systems.
- Developer tools, AI infrastructure, or agentic workflow products.
- Founding or early engineering roles around research-adjacent products.
Proof
- SWE-bench Verified agent pipeline: reproducible recon/craft/audit pipeline with official grader artifacts, committed wins and losses, explicit exclusions, and an append-only audit trail.
- Slop Slope: a test of whether adversarial review loops turn test-passing LLM code from coin-flip drafts into merge-ready artifacts.
- Agentic open-source contribution: pipeline for finding issues, generating patches, submitting PRs, and tracking maintainer outcomes.
- Cognition and epistemology research: abduction, memory, provenance, review loops, and agentic systems as a theory layer for visible tools.
- Sweep & Triage
- (PR) → merged
- PageLeft
- Runnable Textbooks
Resume & Contact
Resume page · PDF · Markdown