# OpenAI API Final Decision And Demo Architecture

Status: Complete with Firecrawl evidence; Tavily blocked by missing `TAVILY_API_KEY`.

Last updated: 2026-05-07

## Final Decision

Use OpenAI as the core provider for the hackathon MVP.

Build this workflow:

```text
Thai caregiver voice
-> OpenAI Realtime API or transcription fallback
-> Responses API with Structured Outputs
-> care_log JSON
-> caregiver burden score
-> safety and escalation flags
-> Thai family handoff message
-> UI timeline and weekly trend
```

Do not build a broad chatbot, provider benchmark, diagnosis tool, medication advisor, or Hugging Face-powered model pipeline for v1.

## Core OpenAI Stack

| Layer | Decision | Why |
|---|---|---|
| Voice front door | Realtime API with `gpt-realtime-1.5` | Strongest demo moment: caregiver speaks naturally in Thai during a stressful moment. |
| Voice fallback | `gpt-4o-transcribe` or `gpt-4o-mini-transcribe`; seeded transcript if needed | Protects demo reliability if microphone, latency, or Realtime setup fails. |
| Structured care data | Responses API with Structured Outputs | Converts messy caregiver narrative into schema-valid data for UI and storage. |
| Text reasoning and handoff | Responses API using latest GPT family model; start with `gpt-5.5` for quality or `gpt-5.4-mini` for latency/cost testing | Official models docs recommend starting with `gpt-5.5` for complex reasoning and using smaller models for latency/cost workloads. |
| Optional multimodal | Responses API image/file input | Later only. Useful for care cards and hospital instructions, but risky and too broad for the first demo. |

## What To Build First

### 1. Seeded Demo Data

Create a seeded Thai caregiver transcript:

```text
เมื่อคืนแม่ตื่นหลายรอบ จะออกจากบ้าน บอกว่าจะกลับบ้าน หนูแทบไม่ได้นอน
```

Use this transcript even if live voice fails.

### 2. Structured Output Endpoint

Build the structured extraction first. It is the center of the product.

Input:

- Thai transcript.
- Optional short caregiver context.
- Current safety rules.

Output:

- `care_log` JSON matching the schema in `research-output/openai-model-and-schema-decision.md`.

### 3. UI Proof

Render the JSON as:

- Incident card.
- Safety risk badge.
- Caregiver stress score.
- Immediate non-medical steps.
- Thai family handoff message.
- Weekly trend from seeded logs.

### 4. Voice Layer

Add Realtime voice after the structured endpoint works. If time is short, use browser voice recording or transcript paste and still show the OpenAI transformation.

## Safety Boundary

The product can say:

- "This is caregiver wellness support."
- "This helps document care events."
- "This helps families coordinate."
- "This can suggest non-medical next steps."
- "This can recommend contacting family, doctor, nurse, pharmacist, or local emergency help when risk is high."

The product cannot say:

- "This diagnoses dementia."
- "This treats Alzheimer's disease."
- "This changes medication."
- "This replaces a doctor."
- "This replaces emergency services."
- "This tells you to use sedatives or restraints."

## Escalation Conditions

Always escalate to local professional or emergency help when the caregiver reports:

- Missing patient.
- Immediate danger.
- Fall or head injury.
- Sudden or acute confusion.
- Violence or self-harm risk.
- Chest pain or stroke signs.
- Severe dehydration.
- Caregiver collapse or inability to continue safely.

## Hugging Face Decision

Use Hugging Face only as supporting research.

Useful:

- Thai ASR assets for future transcription evaluation.
- Dementia speech datasets for background awareness after license review.

Not useful for v1:

- Alzheimer Q&A fine-tuning.
- Dementia detection/classification models.
- Medical-imaging Alzheimer models.
- Any unclear-license caregiver/patient data.

Reason:

The hackathon MVP is not a trained diagnostic system. It is an OpenAI-powered workflow that transforms caregiver speech into safe structure and family coordination.

## Implementation Plan

### Endpoint 1: Structured Care Log

```text
POST /api/care-log
body:
  transcript_thai: string
  caregiver_context?: object
  previous_logs?: array

OpenAI:
  Responses API
  model: start with gpt-5.5, evaluate gpt-5.4-mini for latency/cost
  output: Structured Outputs JSON schema
```

### Endpoint 2: Voice Check-In

Option A, best demo:

```text
Browser microphone
-> Realtime API gpt-realtime-1.5
-> brief Thai conversation
-> transcript/summarized turn
-> /api/care-log
```

Option B, reliable fallback:

```text
Browser recording
-> gpt-4o-transcribe
-> /api/care-log
```

Option C, safest fallback:

```text
Seeded Thai transcript
-> /api/care-log
```

### Endpoint 3: Weekly Trend

```text
POST /api/weekly-trend
body:
  care_logs: array

OpenAI:
  Responses API
  output: short Thai family action summary
```

For the hackathon, this can use seeded logs.

## Demo Script

1. Show caregiver speaks Thai or clicks seeded transcript.
2. AI asks one safety follow-up:
   - "ตอนนี้คุณแม่ปลอดภัยและอยู่ในบ้านไหม?"
3. Structured care log appears.
4. UI highlights:
   - Incident: wandering / exit-seeking.
   - Risk: high.
   - Caregiver sleep: 3 hours.
   - Burden: 5/5.
5. AI shows safe next steps.
6. AI drafts Thai sibling message.
7. Weekly trend shows repeated sleep/wandering risk.

## Final Recommendation

Build the OpenAI MVP now.

Priority order:

1. Responses API with Structured Outputs.
2. Seeded Thai transcript demo.
3. Care log UI and family handoff.
4. Realtime API voice.
5. Weekly trend.
6. Optional multimodal context only if the core demo is stable.

## Evidence Sources

OpenAI:

- https://platform.openai.com/docs/guides/realtime
- https://platform.openai.com/docs/guides/speech-to-text
- https://platform.openai.com/docs/guides/structured-outputs
- https://platform.openai.com/docs/guides/text
- https://platform.openai.com/docs/guides/images-vision
- https://platform.openai.com/docs/guides/file-inputs
- https://platform.openai.com/docs/models
- https://platform.openai.com/docs/guides/latest-model

Hugging Face:

- https://huggingface.co/datasets/SEACrowd/thai_elderly_speech
- https://huggingface.co/datasets/mcshao/EThai-ASR
- https://huggingface.co/collections/tawankri/thai-asr
- https://huggingface.co/datasets/MearaHe/dementiabank
- https://huggingface.co/datasets/AzizSouiai/Alzheimer

Tool limitation:

- Firecrawl was used.
- Tavily was checked but blocked because `TAVILY_API_KEY` is not configured.