Gemini Nano in Chrome 138: notes for AI Engineers
at long last, Gemini Nano is almost here for all Chrome users. I was reminded by this HN post.
I don’t like the way Google writes docs, so this blog post is basically me rewriting their docs in a way that fits my brain.
they have a few APIs for commonly used patterns on offer, but really the main one you’ll care about as an engineer is the Prompt API, the most flexible/open-ended one.
setup
Unlike the initial overpromise of window.ai, the currently released implementation is much less “clean”. anyway, here’s the current way to set it up.
- make sure you have Chrome 138+
- go to chrome://flags/#prompt-api-for-gemini-nano and turn it on (unfortunately you’ll have to relaunch Chrome)
- then download the model by calling LanguageModel.create() for the first time - takes a few mins on home wifi. Gemini says it “has an approximate download size ranging from 1.5 GB to 2.4 GB”, so let’s say that’s a 4-6B model at a 4-8 bit quantization.
const session = await LanguageModel.create({
  monitor(m) {
    m.addEventListener("downloadprogress", (e) => {
      console.log(`Downloaded ${e.loaded * 100}%`);
    });
  },
  // uncomment if you want multimodal input https://developer.chrome.com/docs/ai/prompt-api#multimodal_capabilities
  // expectedInputs: [
  //   { type: "audio" },
  //   { type: "image" }
  // ]
})
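btw, if you want to check whether the model is already downloaded (or even downloadable on this machine) before calling create(), the Prompt API exposes an availability() method - a quick sketch, assuming the Chrome 138 shape of the API:

// returns "unavailable" | "downloadable" | "downloading" | "available"
const status = await LanguageModel.availability();
console.log(status);
if (status === "unavailable") {
  console.warn("Prompt API / Gemini Nano isn't supported on this device or channel");
}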
basic important things
the loaded model has a 6k token context (just ask for inputQuota without any initialPrompts):
session.inputQuota
// 6144
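you can also keep an eye on how much of that quota you’ve burned - per the Prompt API explainer there is a session.inputUsage counter and a measureInputUsage() helper (hedging here: this is from the spec/explainer, double-check it’s in your shipped build):

session.inputUsage
// tokens consumed by this session so far

// measure a prompt *before* sending it so you don't blow past inputQuota
const cost = await session.measureInputUsage("some long prompt...");
if (session.inputUsage + cost > session.inputQuota) {
  console.warn("this prompt won't fit in the remaining context");
}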
Now unlike the Gemini Nano team, I happen to be a guy who thinks that function calling/JSON output is very important, so let’s see how to get this going in Gemini Nano, with prompt examples stolen from Hamel and Jason:
const JSONschema = `<schema>
{
  "description": "Correctly extracted \`UserDetail\` with all the required parameters with correct types",
  "name": "UserDetail",
  "parameters": {
    "properties": {
      "age": {
        "title": "Age",
        "type": "integer"
      },
      "name": {
        "title": "Name",
        "type": "string"
      }
    },
    "required": [
      "age",
      "name"
    ],
    "type": "object"
  }
}
</schema>`
const JSONsession = await LanguageModel.create({
  initialPrompts: [
    { role: 'system', content: 'You are a helpful LLM that only responds in valid JSON fitting a schema: ' + JSONschema },
    { role: 'user', content: "Extract Jason is 35 years old" },
    { role: 'assistant', content: '{age: 35, name: Jason}' },
  ]
});
const result1 = await JSONsession.prompt("Extract sarah is 22 years old");
console.log(result1);
// {age: 22, name: Sarah}
pitfalls
it doesn’t do great instruction following, so required fields aren’t really respected:
const result1 = await JSONsession.prompt("its been a year since vibhu's birthday, he was 28 last year, guess how old he is now");
console.log(result1);
// { "age": 29 }
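one mitigation: the Chrome Prompt API docs also describe a native structured output option, where you pass a JSON Schema as responseConstraint to prompt() (same idea as the responseConstraint example further down, just without the wrapper) - hedged sketch, the plain-object schema here is mine and this may need a recent Chrome:

const userSchema = {
  type: "object",
  properties: {
    age: { type: "integer" },
    name: { type: "string" }
  },
  required: ["age", "name"]
};
// per the Prompt API structured output docs
const constrained = await JSONsession.prompt(
  "Extract sarah is 22 years old",
  { responseConstraint: userSchema }
);
console.log(constrained);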
the other thing is that sessions are stateful by default, which can be a little nasty if you forget. So a stateless version looks like:
const baseSession = await LanguageModel.create({
  initialPrompts: // blah blah, as above
})
// you can also implement this as a class if you want to force users to use the `new` keyword to make super clear it is stateless
const statelessSession = {
  async prompt(str) {
    // clone the base session each time so no state leaks between calls
    const clonedSession = await baseSession.clone()
    return clonedSession.prompt(str)
  }
}
// these are all stateless calls now! yay repeatability and predictability!
const result1 = await statelessSession.prompt("Extract sarah is 22 years old");
console.log(result1);
const result2 = await statelessSession.prompt("Extract tanisha is 30 years old");
console.log(result2);
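one extra caveat with the clone-per-call trick: each clone presumably holds its own resources until it gets garbage collected, so if you’re making lots of calls you may want to destroy() clones when you’re done (destroy() is part of the Prompt API) - minimal tweak:

const statelessSessionTidy = {
  async prompt(str) {
    const clonedSession = await baseSession.clone()
    try {
      return await clonedSession.prompt(str)
    } finally {
      // free the clone's resources instead of waiting for GC
      clonedSession.destroy()
    }
  }
}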
pitfalls like these are why you will probably want little wrapper libraries, either hand-rolled or something off the shelf like https://github.com/kstonekuan/simple-chromium-ai
the last tip here for non-JS pros is how to import those wrapper libraries in browser contexts (aka without npm install or a build step) using ESM syntax (you may need a <script type="module"> tag, and to run on localhost or a site with relaxed CSP):
// alternatively use https://cdn.jsdelivr.net/npm/simple-chromium-ai@0.1.1/dist/simple-chromium-ai.mjs
const ChromiumAI = await import('https://unpkg.com/simple-chromium-ai@0.1.1/dist/simple-chromium-ai.mjs');
const ai = await ChromiumAI.initialize("You are a friendly assistant");
const response = await ChromiumAI.prompt(ai, "Tell me a joke");
console.log(response);
// Why don't scientists trust atoms? Because they make up everything!
// and of course... the structured output implementation now works:
const schema = {
  type: "object",
  properties: {
    sentiment: {
      type: "string",
      enum: ["positive", "negative", "neutral"]
    },
    confidence: {
      type: "number",
      minimum: 0,
      maximum: 1
    },
    keywords: {
      type: "array",
      items: { type: "string" },
      maxItems: 5
    }
  },
  required: ["sentiment", "confidence", "keywords"]
};
// Prompt with a response constraint
const response = await ChromiumAI.prompt(
  ai,
  "Analyze the sentiment of this text: 'I love this new feature!'",
  undefined, // no timeout
  { responseConstraint: schema }
);

// Response will be valid JSON matching the schema
const result = JSON.parse(response);
console.log(result);