Aiii Apply for Beta
Prometheus Beta · Now in private preview

Bringing compute
to humanity

Prometheus · AI Compute Platform

AWS Bedrock, Google Vertex AI, OpenAI and self-hosted Taiwan compute in one place — a single gateway to the best model, with AI compute from 8% off on cloud, usage-based billing and no lock-in.

From 8% offvs. official pricing
4+provider integrations
Self-hostedTaiwan compute · stays onshore
UnifiedOpenAI-compatible API
🔥
Pick a model, infer instantly
Limited-time Beta discounts
AWS Bedrock · Claude Opus 4.8
Flagship reasoning, long-context optimized
8% off
Google Vertex AI · Gemini 2.5 Pro
Multimodal, long-context understanding
5% off
OpenAI · GPT-5 / o4-mini
Broad compatibility, tool calling
5% off
Medical-specialized · Med-Gemini / Meditron 70B
Medical Q&A · clinical-reasoning fine-tuned
5% off
Aiii Gemma 4 Medical (self-hosted in Taiwan)
Taiwan data center · medical fine-tuning
Best price
Unified OpenAI-compatible API — switch providers by changing one line of base_url
💰

Up to 8% off — direct savings

We pool purchasing power across providers and pass the discount straight to you. During Beta you get our best pricing, with no minimum spend.

🔌

One API, four model families

A unified, OpenAI-compatible interface. Switching providers takes one line of base_url — your code stays the same and billing is automatically consolidated.

🇹🇼

Local Taiwan compute

A cluster of Mac Studio M3 Ultra and MacBook Pro M5 Max units hosted in a Taiwan data center. Low-latency local inference, data that stays onshore, and alignment with medical compliance requirements.

⚖️

Healthcare-grade security & compliance

An ISO 27001-certified architecture. Self-hosted Gemma can be configured so data is never written to logs — ideal for pharma, hospitals and other settings with strict security requirements.

📊

Real-time usage dashboard

See token usage, cost and model breakdown for every API key at a glance. Manage multiple keys across teams and split billing by department.

🏗️

Enterprise fine-tuning support

Need medical-domain fine-tuning? Aiii offers a custom Gemma fine-tuning service — your training data stays in your environment and model weights can be kept private.

Models & Pricing
Choose the right model, pay by usage

During Beta, cloud models are 5–8% off and self-hosted Taiwan models are even lower. Full pricing will be announced at general availability.

Beta pricing · lock in the discount before launch
Cloud LLM providers (5–8% off)
AWS Bedrock
Claude Opus 4.8 · Sonnet 4.6 · Llama 4
8% off
vs. official
−8%
official pricing
Strength
Long-form · reasoning
Claude family
Best for
Enterprise KB
Compliant document processing
claude-opus-4-8 claude-sonnet-4-6 claude-haiku-4-5 llama-4-maverick
Google Vertex AI
Gemini 2.5 Pro · 2.5 Flash · Med-Gemini
5% off
vs. official
−5%
official pricing
Strength
Multimodal
2M-token long context
Best for
Image + text
Search-augmented RAG
gemini-2.5-pro gemini-2.5-flash med-gemini
OpenAI
GPT-5 · o4-mini · Whisper · gpt-oss
5% off
vs. official
−5%
official pricing
Strength
Tool calling
Broad ecosystem compatibility
Best for
Agent development
Speech transcription
gpt-5 o4-mini gpt-oss-120b whisper
Medical-specialized LLMs (clinical · compliance use cases)
Medical large language models
Med-Gemini · Meditron 70B · Med-PaLM 2
Medical fine-tuned
Strength
Clinical reasoning
Label & patient-education Q&A
Data sovereignty
On-prem capable
Sensitive records stay onshore
Best for
Pharma / hospitals
Compliance audit trail
med-gemini meditron-70b medpalm-2 aiii-med-gemma4
Open-source & Taiwan-local LLMs
International open-source flagships
Llama 4 · Gemma 4 · Phi-4 · Mistral
On-prem capable
License
Open weights
Self-host & fine-tune
Strength
Cost optimization
Self-hosted inference
Best for
High-volume inference
Private deployment
llama-4-maverick llama-4-scout gemma-4-31b phi-4-reasoning magistral-small gpt-oss-120b qwen
Taiwan-local models (data sovereignty)
TAIDE · FFM · FoxBrain · Taiwan-LLM
Local
Language
Trad. Chinese
Trained on Taiwan context
Sovereignty
Taiwan data center
Data stays onshore
Best for
Government / healthcare
Local compliance needs
llama-3.1-taide-8b llama3.1-ffm-70b foxbrain-70b llama-3-taiwan-70b
Aiii self-hosted compute (Taiwan)
How billing works

Prepaid credits billed by actual token usage. No monthly fee, no minimum spend; during Beta the minimum top-up is NT$30,000.

Apply for Beta →
Self-hosted Taiwan compute

Mac Studio M3 Ultra cluster
built in Taiwan, data stays onshore

A GPU cluster built by Aiii, using Apple M3 Ultra and M5 Max as inference nodes, hosted in a Taiwan data center. Ideal for healthcare, pharma and government use cases with data-sovereignty requirements.

Compute cores
M3 Ultra · M5 Max
Deployment scale
Continuously expanding
Data sovereignty
Taiwan data center, onshore
Supported models
Gemma 4, Llama 4
Compliance
ISO 27001
On-prem deployment
On-prem options available
📍 Taiwan data center · low-latency inference
💻
M3 Ultra #01
💻
M3 Ultra #02
💻
M3 Ultra #03
💻
M3 Ultra #04
💻
M3 Ultra #05
💻
M3 Ultra #06
📓
M5 Max #07
📓
M5 Max #08
📓
M5 Max #09
📓
M5 Max #10
💻
#11 expanding
💻
#12 expanding
💻
#13 planned
💻
#14 planned
💻
#15 planned
50 nodes online · avg latency <120ms · Taiwan data center
Who it's for
Whether you're a developer or an enterprise, there's a plan that fits
👨‍💻

Developers / startups

Connect to the API fast, test multiple models and cut your cloud bill. Get started from NT$30,000 during Beta, with zero learning curve thanks to the OpenAI-compatible format.

API key ready to useMulti-model comparisonLow barrier to entry
🏢

Enterprise IT / AI adoption

Unified billing, per-department API key routing and exportable usage reports. No need to apply for separate AWS / GCP / OpenAI accounts — manage it all in one place.

Multi-key managementUnified billingUsage controls
💊

Pharma / healthcare organizations

Self-hosted Taiwan compute keeps data onshore and supports ISO 27001 / HIPAA compliance requirements. Pair it with the Aiii MCP Engine to deploy compliant medical AI directly.

Data stays onshoreISO 27001Medical complianceOn-prem available
Beta applications now open

Be among the first to try
AI compute from 8% off

Limited to the Beta period. After you apply, a consultant will reach out within 1–2 business days; once we confirm your needs, we'll enable API access and offer an exclusive discounted top-up plan.

Beta spots are limited; we'll be in touch within 1–2 business days after review. We never share your personal information with third parties.

🎉

Application submitted!

We'll email you within 1–2 business days and enable API access once we confirm your needs.
For anything urgent, contact [email protected]

Frequently asked questions
How are discounts calculated, and what are they compared against?
We use each provider's officially published pricing as the baseline, and the Prometheus discount is applied directly to the per-1M-token rate. We're still in integration testing during Beta; a full per-model pricing table will be published at general availability.
Is the API OpenAI-compatible? How much code do I need to change?
It's fully compatible with the OpenAI Chat Completions API. You only change two parameters — base_url and api_key — and the rest of your code stays the same. The OpenAI SDK in any language works directly.
How does self-hosted Taiwan Gemma differ from cloud models?
The main advantages of Aiii's self-hosted Gemma are: (1) data never leaves Taiwan; (2) it can be fine-tuned for medical Chinese; (3) pricing is lower than cloud models; and (4) it suits pharma and hospital use cases with data-sovereignty requirements. That said, larger cloud flagship models (such as GPT-5 and Claude Opus 4.8) still have the edge on complex tasks.
If I don't use up my credits, are they refundable?
The policy isn't finalized during Beta; we expect to offer options to extend the validity of your balance. The refund policy will be spelled out clearly in the terms at general availability. If you have concerns, please note them in your application or contact [email protected] directly.
Can it be deployed on-premises (on-prem)?
Yes. For pharma, hospitals, government agencies and other settings with strict security requirements, Aiii offers on-prem deployment of self-hosted Gemma. It requires assessing compute scale, operations needs and more — please contact our enterprise sales team to plan it.
How is this different from just opening AWS / GCP / OpenAI accounts directly?
Three differences: (1) Discounts: Prometheus's purchasing power means lower unit prices; (2) Unified management: one account, one bill, and unified control across multiple models and keys; (3) Self-hosted Taiwan option: cloud providers don't offer an inference option where data stays onshore, and Aiii Gemma fills that gap.