Prometheus Beta · Now in private preview

Bringing compute
to humanity

Name: Prometheus AI Compute Platform
Brand: Aiii

Prometheus · AI Compute Platform

AWS Bedrock, Google Vertex AI, OpenAI and self-hosted Taiwan compute in one place — a single gateway to the best model, with AI compute from 8% off on cloud, usage-based billing and no lock-in.

Apply for Beta → View pricing

From 8% offvs. official pricing

4+provider integrations

Self-hostedTaiwan compute · stays onshore

UnifiedOpenAI-compatible API

🔥

Pick a model, infer instantly

Limited-time Beta discounts

aws

AWS Bedrock · Claude Opus 4.8

Flagship reasoning, long-context optimized

8% off

GCP

Google Vertex AI · Gemini 2.5 Pro

Multimodal, long-context understanding

5% off

GPT

OpenAI · GPT-5 / o4-mini

Broad compatibility, tool calling

5% off

⚕️

Medical-specialized · Med-Gemini / Meditron 70B

Medical Q&A · clinical-reasoning fine-tuned

5% off

Aiii Gemma 4 Medical (self-hosted in Taiwan)

Taiwan data center · medical fine-tuning

Best price

Unified OpenAI-compatible API — switch providers by changing one line of base_url

💰

Up to 8% off — direct savings

We pool purchasing power across providers and pass the discount straight to you. During Beta you get our best pricing, with no minimum spend.

🔌

One API, four model families

A unified, OpenAI-compatible interface. Switching providers takes one line of base_url — your code stays the same and billing is automatically consolidated.

🇹🇼

Local Taiwan compute

A cluster of Mac Studio M3 Ultra and MacBook Pro M5 Max units hosted in a Taiwan data center. Low-latency local inference, data that stays onshore, and alignment with medical compliance requirements.

⚖️

Healthcare-grade security & compliance

An ISO 27001-certified architecture. Self-hosted Gemma can be configured so data is never written to logs — ideal for pharma, hospitals and other settings with strict security requirements.

📊

Real-time usage dashboard

See token usage, cost and model breakdown for every API key at a glance. Manage multiple keys across teams and split billing by department.

🏗️

Enterprise fine-tuning support

Need medical-domain fine-tuning? Aiii offers a custom Gemma fine-tuning service — your training data stays in your environment and model weights can be kept private.

Models & Pricing

Choose the right model, pay by usage

During Beta, cloud models are 5–8% off and self-hosted Taiwan models are even lower. Full pricing will be announced at general availability.

Beta pricing · lock in the discount before launch

Cloud LLM providers (5–8% off)

aws

AWS Bedrock

Claude Opus 4.8 · Sonnet 4.6 · Llama 4

8% off

vs. official

−8%

official pricing

Strength

Long-form · reasoning

Claude family

Best for

Enterprise KB

Compliant document processing

claude-opus-4-8 claude-sonnet-4-6 claude-haiku-4-5 llama-4-maverick

GCP

Google Vertex AI

Gemini 2.5 Pro · 2.5 Flash · Med-Gemini

5% off

vs. official

−5%

official pricing

Strength

Multimodal

2M-token long context

Best for

Image + text

Search-augmented RAG

gemini-2.5-pro gemini-2.5-flash med-gemini

GPT

OpenAI

GPT-5 · o4-mini · Whisper · gpt-oss

5% off

vs. official

−5%

official pricing

Strength

Tool calling

Broad ecosystem compatibility

Best for

Agent development

Speech transcription

gpt-5 o4-mini gpt-oss-120b whisper

Medical-specialized LLMs (clinical · compliance use cases)

⚕️

Medical large language models

Med-Gemini · Meditron 70B · Med-PaLM 2

Medical fine-tuned

Strength

Clinical reasoning

Label & patient-education Q&A

Data sovereignty

On-prem capable

Sensitive records stay onshore

Best for

Pharma / hospitals

Compliance audit trail

med-gemini meditron-70b medpalm-2 aiii-med-gemma4

Open-source & Taiwan-local LLMs

OSS

International open-source flagships

Llama 4 · Gemma 4 · Phi-4 · Mistral

On-prem capable

License

Open weights

Self-host & fine-tune

Strength

Cost optimization

Self-hosted inference

Best for

High-volume inference

Private deployment

llama-4-maverick llama-4-scout gemma-4-31b phi-4-reasoning magistral-small gpt-oss-120b qwen

🇹🇼

Taiwan-local models (data sovereignty)

TAIDE · FFM · FoxBrain · Taiwan-LLM

Local

Language

Trad. Chinese

Trained on Taiwan context

Sovereignty

Taiwan data center

Data stays onshore

Best for

Government / healthcare

Local compliance needs

llama-3.1-taide-8b llama3.1-ffm-70b foxbrain-70b llama-3-taiwan-70b

Aiii self-hosted compute (Taiwan)

🔥

Aiii Gemma (self-hosted in Taiwan)

Taiwan data center · data stays onshore · medical fine-tuned

Best price

vs. cloud

Lower cost

Self-hosted advantage

Strength

Data sovereignty

Onshore · compliant

Best for

Healthcare / pharma

Sensitive-data inference

gemma-4-31b-medical gemma-4-e4b-fast aiii-embed-zh

How billing works

Prepaid credits billed by actual token usage. No monthly fee, no minimum spend; during Beta the minimum top-up is NT$30,000.

Apply for Beta →

Self-hosted Taiwan compute

Mac Studio M3 Ultra cluster
built in Taiwan, data stays onshore

A GPU cluster built by Aiii, using Apple M3 Ultra and M5 Max as inference nodes, hosted in a Taiwan data center. Ideal for healthcare, pharma and government use cases with data-sovereignty requirements.

Compute cores

M3 Ultra · M5 Max

Deployment scale

Continuously expanding

Data sovereignty

Taiwan data center, onshore

Supported models

Gemma 4, Llama 4

Compliance

ISO 27001

On-prem deployment

On-prem options available

📍 Taiwan data center · low-latency inference

💻

M3 Ultra #01

💻

M3 Ultra #02

💻

M3 Ultra #03

💻

M3 Ultra #04

💻

M3 Ultra #05

💻

M3 Ultra #06

📓

M5 Max #07

📓

M5 Max #08

📓

M5 Max #09

📓

M5 Max #10

💻

#11 expanding

💻

#12 expanding

💻

#13 planned

💻

#14 planned

💻

#15 planned

50 nodes online · avg latency <120ms · Taiwan data center

Who it's for

Whether you're a developer or an enterprise, there's a plan that fits

👨‍💻

Developers / startups

Connect to the API fast, test multiple models and cut your cloud bill. Get started from NT$30,000 during Beta, with zero learning curve thanks to the OpenAI-compatible format.

API key ready to useMulti-model comparisonLow barrier to entry

🏢

Enterprise IT / AI adoption

Unified billing, per-department API key routing and exportable usage reports. No need to apply for separate AWS / GCP / OpenAI accounts — manage it all in one place.

Multi-key managementUnified billingUsage controls

💊

Pharma / healthcare organizations

Self-hosted Taiwan compute keeps data onshore and supports ISO 27001 / HIPAA compliance requirements. Pair it with the Aiii MCP Engine to deploy compliant medical AI directly.

Data stays onshoreISO 27001Medical complianceOn-prem available

Beta applications now open

Be among the first to try
AI compute from 8% off

Limited to the Beta period. After you apply, a consultant will reach out within 1–2 business days; once we confirm your needs, we'll enable API access and offer an exclusive discounted top-up plan.

Name *

Company *

Work email *

Phone *

Intended use case *

Additional notes (optional)

Beta spots are limited; we'll be in touch within 1–2 business days after review. We never share your personal information with third parties.

🎉

Application submitted!

We'll email you within 1–2 business days and enable API access once we confirm your needs.
For anything urgent, contact [email protected]

Frequently asked questions

How are discounts calculated, and what are they compared against?

We use each provider's officially published pricing as the baseline, and the Prometheus discount is applied directly to the per-1M-token rate. We're still in integration testing during Beta; a full per-model pricing table will be published at general availability.

Is the API OpenAI-compatible? How much code do I need to change?

It's fully compatible with the OpenAI Chat Completions API. You only change two parameters — base_url and api_key — and the rest of your code stays the same. The OpenAI SDK in any language works directly.

How does self-hosted Taiwan Gemma differ from cloud models?

The main advantages of Aiii's self-hosted Gemma are: (1) data never leaves Taiwan; (2) it can be fine-tuned for medical Chinese; (3) pricing is lower than cloud models; and (4) it suits pharma and hospital use cases with data-sovereignty requirements. That said, larger cloud flagship models (such as GPT-5 and Claude Opus 4.8) still have the edge on complex tasks.

If I don't use up my credits, are they refundable?

The policy isn't finalized during Beta; we expect to offer options to extend the validity of your balance. The refund policy will be spelled out clearly in the terms at general availability. If you have concerns, please note them in your application or contact [email protected] directly.

Can it be deployed on-premises (on-prem)?

Yes. For pharma, hospitals, government agencies and other settings with strict security requirements, Aiii offers on-prem deployment of self-hosted Gemma. It requires assessing compute scale, operations needs and more — please contact our enterprise sales team to plan it.

How is this different from just opening AWS / GCP / OpenAI accounts directly?

Three differences: (1) Discounts: Prometheus's purchasing power means lower unit prices; (2) Unified management: one account, one bill, and unified control across multiple models and keys; (3) Self-hosted Taiwan option: cloud providers don't offer an inference option where data stays onshore, and Aiii Gemma fills that gap.

Bringing computeto humanity

Up to 8% off — direct savings

One API, four model families

Local Taiwan compute

Healthcare-grade security & compliance

Real-time usage dashboard

Enterprise fine-tuning support

Mac Studio M3 Ultra clusterbuilt in Taiwan, data stays onshore

Developers / startups

Enterprise IT / AI adoption

Pharma / healthcare organizations

Be among the first to tryAI compute from 8% off

Application submitted!

Bringing compute
to humanity

Mac Studio M3 Ultra cluster
built in Taiwan, data stays onshore

Be among the first to try
AI compute from 8% off