<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Shin Li</title><link>https://shin13.github.io/</link><description>Recent content on Shin Li</description><generator>Hugo</generator><language>en-US</language><copyright>Shin Li</copyright><atom:link href="https://shin13.github.io/index.xml" rel="self" type="application/rss+xml"/><item><title>The 7 Skills You Need to Build AI Agents</title><link>https://shin13.github.io/notes/the-7-skills-you-need-to-build-ai-agents/</link><pubDate>Wed, 13 May 2026 10:42:46 +0800</pubDate><guid>https://shin13.github.io/notes/the-7-skills-you-need-to-build-ai-agents/</guid><description>&lt;p&gt;IBM Technology&amp;rsquo;s &lt;em&gt;The 7 Skills You Need to Build AI Agents&lt;/em&gt; makes a point that feels increasingly true: if an agent can act in the real world, then prompt writing is only the starting point.&lt;/p&gt;</description></item><item><title>[Dev] Following a Goal with Codex (/goal)</title><link>https://shin13.github.io/notes/following-a-goal-with-codex/</link><pubDate>Tue, 12 May 2026 06:00:00 +0800</pubDate><guid>https://shin13.github.io/notes/following-a-goal-with-codex/</guid><description>&lt;p&gt;I have been looking for a clean way to explain what &lt;code&gt;/goal&lt;/code&gt; really does in Codex.&lt;/p&gt;
&lt;p&gt;The most useful mental model I found is simple: &lt;code&gt;/goal&lt;/code&gt; is not a prettier prompt. It is a working contract for long-running agent work. You are telling the agent what success looks like, what the boundary is, and how to know when to stop.&lt;/p&gt;
&lt;p&gt;That framing matters because the feature is built for work that outlives one turn. If the objective is durable enough, the agent can keep making progress, validate its own steps, and come back to you with a result instead of a half-finished thought.&lt;/p&gt;</description></item><item><title>[Dev] Learning from Matt Pocock’s Agent Skills</title><link>https://shin13.github.io/notes/learning-from-matt-pocock-agent-skills/</link><pubDate>Thu, 07 May 2026 08:53:00 +0800</pubDate><guid>https://shin13.github.io/notes/learning-from-matt-pocock-agent-skills/</guid><description>&lt;p&gt;I recently read Matt Pocock’s article, &lt;a href="https://www.aihero.dev/5-agent-skills-i-use-every-day"&gt;“5 Agent Skills I Use Every Day”&lt;/a&gt;. It resonated with my experience using coding agents built on models such as Claude Sonnet and Claude Opus.&lt;/p&gt;
&lt;p&gt;The article gave me clearer language for something I have been feeling: good agent work depends on good engineering process. We need better questions, written context, small slices, tests, and codebases that agents can understand.&lt;/p&gt;</description></item><item><title>[Dev] Trying Reflex (Python) for Web Apps</title><link>https://shin13.github.io/notes/trying-reflex-python-for-web-apps/</link><pubDate>Wed, 06 May 2026 03:10:00 +0800</pubDate><guid>https://shin13.github.io/notes/trying-reflex-python-for-web-apps/</guid><description>&lt;p&gt;I’ve been using Streamlit for quick internal tools and dashboards, but a colleague introduced me to Reflex, so I’m trying it out as another way to build Python web apps.&lt;/p&gt;
&lt;p&gt;What caught my attention is that Reflex is a full-stack Python framework for building web apps with UI, state, backend logic, data models, and deployment in one codebase. That makes it a good fit for Python backend developers who want to build more scalable, production-ready web apps.&lt;/p&gt;</description></item><item><title>[Research] Optimizing Order Sets With a Large Language Model–Powered Multiagent System</title><link>https://shin13.github.io/notes/optimizing-order-sets-with-large-language-model-powered-multiagent-system/</link><pubDate>Tue, 18 Nov 2025 09:42:00 +0800</pubDate><guid>https://shin13.github.io/notes/optimizing-order-sets-with-large-language-model-powered-multiagent-system/</guid><description>&lt;h2 id="paper-overview"&gt;Paper Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Title:&lt;/strong&gt; Optimizing Order Sets With a Large Language Model–Powered Multiagent System&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Liu S, Huang SS, McCoy AB, Wright AP, Horst S, Wright A&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Journal:&lt;/strong&gt; JAMA Network Open&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2025&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DOI:&lt;/strong&gt; &lt;a href="https://doi.org/10.1001/jamanetworkopen.2025.33277"&gt;https://doi.org/10.1001/jamanetworkopen.2025.33277&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="why-this-paper"&gt;Why This Paper?&lt;/h2&gt;
&lt;p&gt;I read this paper because it sits at the intersection of clinical pharmacy, healthcare workflow, and practical AI systems.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Relevant to clinical decision support and order-set maintenance&lt;/li&gt;
&lt;li&gt;Uses a multiagent LLM design instead of a single-model prompt&lt;/li&gt;
&lt;li&gt;Shows the gap between factual correctness and actual clinical usefulness&lt;/li&gt;
&lt;li&gt;Offers a good example of expert alignment in a high-stakes domain&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This article is a cleaned-up conversion of my original blog post into the site’s Notes format.&lt;/p&gt;</description></item><item><title>[Research] A Framework for Human Evaluation of Large Language Models in Healthcare Derived from Literature Review</title><link>https://shin13.github.io/notes/framework-for-human-evaluation-of-large-language-models-in-healthcare-derived-from-literature-review/</link><pubDate>Fri, 14 Nov 2025 14:47:00 +0800</pubDate><guid>https://shin13.github.io/notes/framework-for-human-evaluation-of-large-language-models-in-healthcare-derived-from-literature-review/</guid><description>&lt;h2 id="paper-overview"&gt;Paper Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Title:&lt;/strong&gt; A Framework for Human Evaluation of Large Language Models in Healthcare Derived from Literature Review&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Thomas Yu Chow Tam, Sonish Sivarajkumar, Sumit Kapoor, Alisa V. Stolyar, Katelyn Polanska, Karleigh R. McCarthy, Hunter Osterhoudt, Xizhi Wu, Shyam Visweswaran, Sunyang Fu, Piyush Mathur, Giovanni E. Cacciamani, Cong Sun, Yifan Peng, and Yanshan Wang&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Journal:&lt;/strong&gt; npj Digital Medicine&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2024&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DOI:&lt;/strong&gt; &lt;a href="https://doi.org/10.1038/s41746-024-01258-7"&gt;https://doi.org/10.1038/s41746-024-01258-7&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This scoping review analyzes 142 studies of human evaluation for healthcare LLMs and argues that current practice is inconsistent, under-specified, and often too weak for high-risk clinical use cases.&lt;/p&gt;
&lt;h2 id="selected-figures"&gt;Selected Figures&lt;/h2&gt;
&lt;h3 id="figure-1-healthcare-applications-of-llms"&gt;Figure 1. Healthcare applications of LLMs&lt;/h3&gt;
&lt;p&gt;&lt;img alt="Fig. 1: Healthcare applications of LLMs." loading="lazy" src="https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41746-024-01258-7/MediaObjects/41746_2024_1258_Fig1_HTML.png"&gt;&lt;/p&gt;
&lt;p&gt;This figure shows where human evaluation has been used most often: clinical decision support, medical education, patient education, and question answering.&lt;/p&gt;
&lt;h3 id="figure-7-quest-human-evaluation-framework"&gt;Figure 7. QUEST human evaluation framework&lt;/h3&gt;
&lt;p&gt;&lt;img alt="Fig. 7: The proposed QUEST human evaluation framework, delineating the multi-stage process for evaluating healthcare-related LLMs." loading="lazy" src="https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41746-024-01258-7/MediaObjects/41746_2024_1258_Fig7_HTML.png"&gt;&lt;/p&gt;
&lt;p&gt;This is the most important figure in the paper because it turns the review findings into a practical evaluation workflow.&lt;/p&gt;
&lt;h3 id="figure-9-prisma-flow-diagram"&gt;Figure 9. PRISMA flow diagram&lt;/h3&gt;
&lt;p&gt;&lt;img alt="Fig. 9: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram of the article screening and identification process." loading="lazy" src="https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41746-024-01258-7/MediaObjects/41746_2024_1258_Fig9_HTML.png"&gt;&lt;/p&gt;
&lt;p&gt;This figure summarizes the literature search and screening process behind the 142 included studies.&lt;/p&gt;</description></item><item><title>Markdown Syntax Guide</title><link>https://shin13.github.io/notes/markdown/</link><pubDate>Sun, 01 Sep 2024 05:34:09 +0800</pubDate><guid>https://shin13.github.io/notes/markdown/</guid><description>&lt;p&gt;This article offers a sample of basic Markdown syntax that can be used in Hugo content files.&lt;/p&gt;</description></item><item><title>First Post</title><link>https://shin13.github.io/notes/first-post/</link><pubDate>Sun, 01 Sep 2024 00:08:39 +0800</pubDate><guid>https://shin13.github.io/notes/first-post/</guid><description>&lt;h2 id="hit-the-ground-running"&gt;Hit the ground running&lt;/h2&gt;
&lt;p&gt;This is my first post.&lt;/p&gt;</description></item><item><title>About</title><link>https://shin13.github.io/about/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://shin13.github.io/about/</guid><description>&lt;p&gt;I’m Shin Li, a pharmacist, engineer, healthcare AI researcher, educator, and lifelong learner based in Taipei.&lt;/p&gt;
&lt;p&gt;I spend much of my time at the edges between domains: clinical pharmacy and software, healthcare workflows and AI systems, research and teaching, structure and creativity. I like making complex things easier to understand, and I care about tools that are not only technically interesting, but also useful in real clinical and human contexts.&lt;/p&gt;</description></item><item><title>Now</title><link>https://shin13.github.io/now/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://shin13.github.io/now/</guid><description>What I am focused on at this point in life.</description></item><item><title>Projects</title><link>https://shin13.github.io/projects/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://shin13.github.io/projects/</guid><description>Things I’ve built, organized, studied, or kept returning to.</description></item></channel></rss>