# Nano Banana 2 Partner Node Notes

Session notes for the ComfyUI partner node used to run Nano Banana 2.

## What was confirmed

- The node is exposed by ComfyUI as `GeminiNanoBanana2` and `GeminiNanoBanana2V2`.
- Display name: `Nano Banana 2`.
- The node description says it generates or edits images synchronously via the Google Vertex API.
- The node advertises hidden input `api_key_comfy_org`, which is *not* the same thing as the Comfy Cloud auth key.
- The workflow runner accepts `--partner-key`, and maps it to `extra_data.api_key_comfy_org`.

## Practical usage

- Discover partner nodes via `GET /object_info` on the running ComfyUI server.
- Use `python3 .../run_workflow.py --partner-key <key>` or submit REST payloads with `extra_data.api_key_comfy_org=<key>` rather than trying to stuff the key into the image prompt or cloud auth header.
- If the user previously supplied the partner key and asks not to repeat it, recover the local key without printing it by searching Hermes logs/session dumps for a `comfyui-...` token, then store it in a temp file for scripts. Never echo the value in chat or logs; report only that it was found and its length if needed.
- Prefer `response_modalities=IMAGE` and `thinking_level=MINIMAL` for a first-pass image edit/generation.
- For subtle hairstyle edits, keep the prompt narrow:
  - preserve the same bird
  - preserve body shape, beak, eye, pose, and framing
  - change only the red head feathers/hair
  - explicitly say not to remove the hair
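
The discovery step can be sketched as below. This is a minimal sketch, assuming `/object_info` returns a JSON object keyed by node class name, with each entry carrying an `input` dict that may include a `hidden` section. The helper names `find_partner_nodes` and `fetch_object_info` are hypothetical, and the base URL is an assumption about a default local server.

```python
import json
import urllib.request


def find_partner_nodes(object_info, hidden_key="api_key_comfy_org"):
    """Return sorted class names whose hidden inputs include `hidden_key`."""
    names = []
    for class_name, info in object_info.items():
        # Assumed layout: info["input"]["hidden"] maps hidden input names to types.
        hidden = (info.get("input") or {}).get("hidden") or {}
        if hidden_key in hidden:
            names.append(class_name)
    return sorted(names)


def fetch_object_info(base_url="http://127.0.0.1:8188"):
    """GET /object_info from a running ComfyUI server (network call)."""
    with urllib.request.urlopen(f"{base_url}/object_info", timeout=30) as resp:
        return json.load(resp)
```

Calling `find_partner_nodes(fetch_object_info())` against a live server should list the classes that expect a partner key; the filter itself works on any dict of the assumed shape, so it can be checked offline.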

## Pitfall observed

- A broad inpaint-style fallback can erase the hairstyle instead of restyling it.
- The partner node produced the intended result better than the local SD1.5 inpaint fallback for the same task.

## Prompt shaping lessons from this session

When the user wants a subtle hairstyle edit on a bird or similar subject, be explicit about the invariants and the desired silhouette:

- preserve the same bird / same subject identity
- preserve body shape, beak, eye, pose, and framing
- change only the red head feathers / hair region
- do not remove the hair or flatten it into a blank patch
- if the request is for a perm, brushing, or volume, ask for:
  - more bombé / lifted silhouette
  - soft brushing toward the back
  - fine strands / airy texture
  - still photorealistic and still red

These wording details materially improve the partner node output when the fallback inpaint flow tends to over-edit.
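
One way to keep these invariants from being forgotten is to assemble the prompt from named parts. The sketch below is only an illustration: `build_edit_prompt` and its sentence template are hypothetical, not a format the partner node requires.

```python
def build_edit_prompt(subject, change, invariants, style=(), avoid=()):
    """Assemble a narrow edit prompt from explicit invariants and constraints."""
    parts = [f"Edit the image of the same {subject}."]
    parts.append("Preserve: " + "; ".join(invariants) + ".")
    parts.append(f"Change only: {change}.")
    if style:
        parts.append("Desired result: " + "; ".join(style) + ".")
    if avoid:
        parts.append("Do not: " + "; ".join(avoid) + ".")
    return " ".join(parts)


prompt = build_edit_prompt(
    subject="bird",
    change="the red head feathers / hair region",
    invariants=["body shape", "beak", "eye", "pose", "framing"],
    style=["lifted silhouette", "soft brushing toward the back",
           "fine airy strands", "still photorealistic and still red"],
    avoid=["remove the hair", "flatten it into a blank patch"],
)
```

Keeping the invariants in a list makes it easy to reuse the same constraints across variants while changing only the `style` entries.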

## Minimal single-image pattern

1. Load the source image.
2. Call the Nano Banana 2 partner node.
3. Save one output first before scaling to multiple variants.
4. Verify the output visually before batch generation.

## Image generation (txt2img, no input image)

Nano Banana 2 can also generate images from text alone (no reference image needed).
The `GeminiNanoBanana2` node accepts `aspect_ratio` (1:1, 4:5, 9:16, 16:9, etc.)
and `resolution` (1K, 2K, 4K) as required inputs. For txt2img, simply omit the
optional `images` input:

```python
import random

prompt = '...'  # your text prompt
wf = {
  '1': {'class_type': 'GeminiNanoBanana2', 'inputs': {
    'prompt': prompt,
    'model': 'Nano Banana 2 (Gemini 3.1 Flash Image)',
    'seed': random.randint(1, 2**31-1),
    'aspect_ratio': '4:5',
    'resolution': '2K',
    'response_modalities': 'IMAGE',
    'thinking_level': 'MINIMAL'
  }},
  '2': {'class_type': 'SaveImage', 'inputs': {'images': ['1', 0], 'filename_prefix': 'nano/output'}}
}
```

## Same-reference series (identity consistency)

To generate multiple images of the "same person" in different situations, pass a
reference image via the `images` input. Load it with `LoadImage`:

```python
import random

wf = {
  '0': {'class_type': 'LoadImage', 'inputs': {'image': 'reference.png'}},
  '1': {'class_type': 'GeminiNanoBanana2', 'inputs': {
    'prompt': 'Transform the reference woman into <situation>. Preserve identity cues.',
    'model': 'Nano Banana 2 (Gemini 3.1 Flash Image)',
    'seed': random.randint(1, 2**31-1),
    'aspect_ratio': '4:5',
    'resolution': '2K',
    'response_modalities': 'IMAGE',
    'thinking_level': 'MINIMAL',
    'images': ['0', 0]
  }},
  '2': {'class_type': 'SaveImage', 'inputs': {'images': ['1', 0], 'filename_prefix': 'nano/series'}}
}
```

Identity consistency is decent but not perfect across calls. The model tends to
keep general features (skin tone, hair color, face shape) but may drift on
specifics. QA each output individually.

## Pitfall: background calls can hang

Nano Banana 2 calls via the partner node can occasionally hang (no response after
2+ minutes). The `/queue` endpoint will show the job as `queue_running` but
`/history/<prompt_id>` never populates. When this happens:

1. `curl -X POST http://127.0.0.1:8188/interrupt` to cancel the stuck job.
2. Submit the next image in the batch.
3. Use a per-image timeout (e.g. 420s) in the generation script rather than
   waiting indefinitely.
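
The per-image timeout can be sketched as a polling loop. This is a minimal sketch, assuming `GET /history/<prompt_id>` returns an empty object while the job is running and an object keyed by the prompt ID once it finishes. `wait_for_history` is a hypothetical helper, and `fetch_history` is a caller-supplied function that performs the HTTP request.

```python
import time


def wait_for_history(prompt_id, fetch_history, timeout_s=420.0, poll_s=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until the history entry for prompt_id appears; return None on timeout.

    fetch_history(prompt_id) should return the parsed JSON body of
    GET /history/<prompt_id> (assumed to be {} while the job is unfinished).
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        history = fetch_history(prompt_id)
        if prompt_id in history:
            return history[prompt_id]
        sleep(poll_s)
    return None
```

When the function returns `None`, the caller would send `POST /interrupt` and move on to the next image. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.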

## Partner key recovery

The partner key (`comfyui-...`) is NOT the same as `COMFY_CLOUD_API_KEY`. If the
key was previously sent by the user in a chat, it can be recovered locally from
Hermes logs without asking the user again:

```python
from pathlib import Path
import re

candidates = []
for p in [Path('~/.hermes/logs/gateway.log'), Path('~/.hermes/logs/agent.log')]:
    p = p.expanduser()
    if p.exists():
        txt = p.read_text(errors='ignore')
        candidates += re.findall(r'comfyui-[A-Za-z0-9_\-\.]+', txt)
if not candidates:
    # Fail clearly instead of raising IndexError on an empty list.
    raise SystemExit('partner key not found in Hermes logs')
key = max(candidates, key=len)  # longest match; never print the value itself
```

Store it in a temp file and read it in the generation script. Never print the key
to stdout or logs.
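
Storing the key without exposing it can be sketched as follows. The helper names `store_key` and `load_key` and the file prefix are hypothetical; the sketch relies on `tempfile.mkstemp`, which creates the file readable and writable only by the current user on POSIX systems.

```python
import os
import tempfile


def store_key(key, prefix="partner_key_"):
    """Write the key to a new private temp file; return (path, key length)."""
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".txt")
    with os.fdopen(fd, "w") as fh:
        fh.write(key)
    # Only the path and the length are returned for reporting, never the value.
    return path, len(key)


def load_key(path):
    """Read the key back inside the generation script."""
    with open(path) as fh:
        return fh.read().strip()
```

A script would report something like "key found, N characters" using the returned length, and delete the temp file once the batch is done.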

## Text-to-image workflow (no input image needed)

Nano Banana 2 can generate from prompt alone. The minimal API-format workflow is:

```json
{
  "1": {"class_type":"GeminiNanoBanana2","inputs":{
    "prompt": "...",
    "model": "Nano Banana 2 (Gemini 3.1 Flash Image)",
    "seed": 12345,
    "aspect_ratio": "4:5",
    "resolution": "2K",
    "response_modalities": "IMAGE",
    "thinking_level": "MINIMAL"
  }},
  "2": {"class_type":"SaveImage","inputs":{"images":["1",0],"filename_prefix":"output"}}
}
```

Submit with `extra_data.api_key_comfy_org` in the `/prompt` payload.
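
A minimal sketch of that submission, using only the standard library. The helper names are hypothetical, the base URL assumes a default local server, and reading `prompt_id` from the response is an assumption about the server's reply format.

```python
import json
import urllib.request


def build_prompt_payload(workflow, partner_key):
    """Wrap an API-format workflow with the partner key in extra_data."""
    return {
        "prompt": workflow,
        "extra_data": {"api_key_comfy_org": partner_key},
    }


def submit(workflow, partner_key, base_url="http://127.0.0.1:8188"):
    """POST the payload to /prompt and return the prompt ID (network call)."""
    body = json.dumps(build_prompt_payload(workflow, partner_key)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["prompt_id"]  # assumed response field
```

The key travels only in `extra_data`; it does not appear in the node inputs or in the prompt text.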

## Image-to-image with identity reference

Pass an input image via a `LoadImage` node connected to the `images` input of `GeminiNanoBanana2`. This is useful for generating variations of the same person in different situations:

```json
{
  "0": {"class_type":"LoadImage","inputs":{"image":"reference.png"}},
  "1": {"class_type":"GeminiNanoBanana2","inputs":{
    "prompt": "Using the reference woman as identity inspiration, ...",
    "model": "Nano Banana 2 (Gemini 3.1 Flash Image)",
    "seed": 12345,
    "aspect_ratio": "4:5",
    "resolution": "2K",
    "response_modalities": "IMAGE",
    "thinking_level": "MINIMAL",
    "images": ["0", 0]
  }},
  "2": {"class_type":"SaveImage","inputs":{"images":["1",0],"filename_prefix":"output"}}
}
```

Identity consistency is imperfect: the model keeps the general vibe (skin tone, hair, ethnicity), but face details drift between outputs. This is acceptable for a deck demonstrating "same person, different situations" but not for deepfake-level identity preservation.

## Batch generation pitfalls

- Nano Banana 2 calls can hang or take 20-30s each. Set per-image timeouts (360-420s) and use `notify_on_complete=true`.
- If a batch script blocks on a stalled API call, interrupt the ComfyUI queue (`POST /interrupt`) and kill the script rather than waiting indefinitely.
- Generate images one at a time (sequential, not parallel) to avoid overwhelming the partner API.
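
A sequential batch loop that survives stalled calls might look like the sketch below. `run_batch` is hypothetical; `run_one` stands for whatever function submits one workflow and returns its result, or `None` when the per-image timeout expires, and `on_stall` is where a `POST /interrupt` call would go.

```python
def run_batch(jobs, run_one, on_stall=None):
    """Run (name, job) pairs one at a time; collect results and stalled names."""
    results, stalled = {}, []
    for name, job in jobs:
        result = run_one(job)  # expected to return None on timeout
        if result is None:
            stalled.append(name)
            if on_stall is not None:
                on_stall(name)  # e.g. interrupt the ComfyUI queue
            continue
        results[name] = result
    return results, stalled
```

Returning the stalled names separately lets the caller retry only those images instead of rerunning the whole batch.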

## Available node inputs

Key inputs on `GeminiNanoBanana2`:
- `aspect_ratio`: auto, 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
- `resolution`: 1K, 2K, 4K (2K/4K use native Gemini upscaler)
- `response_modalities`: IMAGE or IMAGE+TEXT
- `thinking_level`: MINIMAL or HIGH
- `images` (optional): reference image(s) for identity/image-to-image
- `system_prompt` (optional, advanced): override the default image-gen system prompt
- Hidden: `api_key_comfy_org`, `auth_token_comfy_org`, `unique_id`, `comfy_usage_source`
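
Checking a workflow's inputs against these values before submission avoids spending an API call on a typo. The sketch below hard-codes the values listed above, which may change with node updates; `validate_inputs` is a hypothetical helper.

```python
# Allowed values copied from the list above; re-check against /object_info.
ALLOWED = {
    "aspect_ratio": {"auto", "1:1", "2:3", "3:2", "3:4", "4:3",
                     "4:5", "5:4", "9:16", "16:9", "21:9"},
    "resolution": {"1K", "2K", "4K"},
    "response_modalities": {"IMAGE", "IMAGE+TEXT"},
    "thinking_level": {"MINIMAL", "HIGH"},
}


def validate_inputs(inputs):
    """Return a list of error strings; an empty list means the inputs look valid."""
    errors = []
    for field, allowed in ALLOWED.items():
        value = inputs.get(field)
        if value not in allowed:
            errors.append(f"{field}={value!r} not in {sorted(allowed)}")
    return errors
```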

## Portrait enhancement from real photos

Nano Banana 2 excels at taking a real photo and "pimping" it into a premium
studio portrait while keeping the person recognizable. Pattern:

1. Upload the reference photo to ComfyUI input/ (use a UNIQUE filename —
   see pitfall #19 in SKILL.md about the no-overwrite behavior):
   ```python
   # POST /upload/image, multipart form, filename="flo_v2_ref.jpg"
   # Read the returned "name" field — it may differ from what you sent
   ```

2. Build workflow: LoadImage(ref) → GeminiNanoBanana2(images=ref) → SaveImage

3. Prompt pattern for studio portraits:
   ```
   Using the reference person as identity, create a premium professional
   studio portrait of this [man/woman]. Black background, dramatic cinematic
   lighting with warm gold rim light, dark elegant outfit, confident pose,
   high-end [creative director/tech] aesthetic. Sharp detail, magazine
   quality, luxury studio photography.
   ```

4. Generate 2-3 variants per person. Identity preservation is decent but
   not perfect — the model keeps general features (face shape, ethnicity,
   hair) but may drift on specifics.

5. This works well for team slides in pitch decks where you want a cohesive
   premium look across all team members regardless of their original photo
   quality.
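
The filename handling in step 1 can be sketched as below. Both helpers are hypothetical: `unique_upload_name` adds a tag so that repeated uploads never collide, and `stored_name` prefers the `name` field from the upload response, which may differ from what was sent.

```python
import uuid
from pathlib import Path


def unique_upload_name(original, tag=None):
    """Return the filename with a unique tag inserted before the extension."""
    p = Path(original)
    tag = tag or uuid.uuid4().hex[:8]
    return f"{p.stem}_{tag}{p.suffix}"


def stored_name(upload_response, sent_name):
    """Use the name the server reports, falling back to the name that was sent."""
    return upload_response.get("name") or sent_name
```

The value returned by `stored_name` is what should go into the `image` input of the `LoadImage` node.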

## Critical pitfall: identity preservation can fail completely

Nano Banana 2's identity preservation is **unreliable for real person portraits**.
The model captures the "vibe" (ethnicity, hair color, general face shape) but
frequently produces outputs that the user rejects as "not resembling" the
reference person at all. In one session:

- Maria's portrait from reference photo 1 (wedding photo): rejected as
  "not ressemblant" (French for "not a good likeness")
- Maria's portrait from reference photo 2 (different photo): also rejected
- 4 additional reference photos were collected, 4 variants generated (one
  per reference), and the user had to pick the best — still imperfect

This is worse than the general "drift on specifics" noted above. Some faces
or photo angles simply do not translate well through the Nano Banana 2
identity pipeline.

### Recovery strategy: multi-reference variant generation

When identity preservation fails:

1. Ask the user for MORE reference photos (4+ if possible, different angles,
   lighting, and contexts).
2. Upload each as a uniquely-named file (e.g. `maria_v3a.jpg`, `maria_v3b.jpg`)
   to avoid the no-overwrite pitfall (#19 in SKILL.md).
3. Generate one variant per reference photo, all with the same prompt.
4. Send all variants to the user via Telegram for visual selection.
5. Use the user-picked variant in the deck.

This is expensive (4+ API calls) but is the only reliable way to get a
portrait that resembles the person when single-reference generation fails.
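
Step 3 can be sketched as a function that builds one workflow per reference file with a shared prompt. `build_variant_workflows` and the `nano/variants` prefix are hypothetical; the node inputs mirror the workflows shown earlier in these notes.

```python
import random


def build_variant_workflows(reference_files, prompt, prefix="nano/variants"):
    """Return {reference filename: API-format workflow}, all sharing one prompt."""
    workflows = {}
    for ref in reference_files:
        stem = ref.rsplit(".", 1)[0]  # filename without its extension
        workflows[ref] = {
            "0": {"class_type": "LoadImage", "inputs": {"image": ref}},
            "1": {"class_type": "GeminiNanoBanana2", "inputs": {
                "prompt": prompt,
                "model": "Nano Banana 2 (Gemini 3.1 Flash Image)",
                "seed": random.randint(1, 2**31 - 1),
                "aspect_ratio": "4:5",
                "resolution": "2K",
                "response_modalities": "IMAGE",
                "thinking_level": "MINIMAL",
                "images": ["0", 0],
            }},
            "2": {"class_type": "SaveImage", "inputs": {
                "images": ["1", 0],
                "filename_prefix": f"{prefix}/{stem}",
            }},
        }
    return workflows
```

Naming each output after its reference file makes it easy to tell the user which source photo produced which variant.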

### Alternative: accept imperfection for deck use

For team slides in pitch decks, a "premium studio look" that captures the
general vibe may be acceptable even if it's not a perfect likeness. Let the
user decide — send the variants and let them choose rather than assuming
rejection.

Nano Banana 2 produces significantly less generic output than SD1.5 or Z-Image Turbo for portraits/people. Faces are more natural, skin texture is realistic, and it follows specific ethnicity/styling requests well. However, it can still produce fake text/glyphs on UI-type images (dashboards, tablets) — always QA and regenerate or blur those.
