Show HN: Open-source telephony stack for AI voice agents (Twilio alternative)

Hacker News · 4 months ago

A new open-source project, VectorlyApp/open-telephony-stack, provides a HIPAA-eligible, self-hosted Twilio alternative for AI voice telephone applications, built on Asterisk PBX and AWS Chime SIP trunking.


HIPAA-eligible DIY Twilio alternative for voice AI telephone applications. Uses Asterisk PBX and AWS Chime SIP trunking.


VectorlyApp/open-telephony-stack


HIPAA-eligible DIY Twilio alternative

For more context, see the blog release post: https://vectorly.app/blog/open-telephony-stack

This repository contains the complete infrastructure for building a HIPAA-eligible phone system using Asterisk and AWS Chime SDK SIP Media Application.

This is a production-ready alternative to Twilio that gives you full control over your telephony infrastructure while maintaining HIPAA eligibility. It includes a sample voice agent server for integrating with AI voice services (e.g., OpenAI Realtime API).

Why this matters

Twilio has limitations for healthcare applications:

This solution provides:

What this is

A complete and secure telephony system built to handle both inbound and outbound calls:

You bring your own AI. This just handles the phone infrastructure.

Who this is for

Use case examples:

Consider alternatives if:

Infrastructure requires time and maintenance. That's the trade-off.

Architecture

The system routes calls from AWS Chime Voice Connector through an Asterisk PBX server, which bridges RTP audio streams to a FastAPI shim server. The shim server converts the audio to WebSocket events compatible with Twilio's Media Streams API, allowing seamless integration with your voice AI services.
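As a rough illustration of the RTP-to-WebSocket conversion described above, the sketch below strips the RTP header from one packet and wraps the audio in a Twilio-style `media` event. It assumes the standard RFC 3550 header layout and Twilio's Media Streams convention of base64-encoded audio; the function names `rtp_payload` and `media_event` are hypothetical and are not taken from this repository, and the shim's actual event fields may differ.

```python
import base64
import json
import struct


def rtp_payload(packet: bytes) -> bytes:
    """Return the audio payload of one RTP packet (RFC 3550 header layout)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed 12-byte RTP header")
    first = packet[0]
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    # Skip the fixed header plus any CSRC identifiers (4 bytes each).
    offset = 12 + 4 * (first & 0x0F)
    if first & 0x10:  # header extension present
        if len(packet) < offset + 4:
            raise ValueError("truncated RTP header extension")
        (ext_words,) = struct.unpack_from("!H", packet, offset + 2)
        offset += 4 + 4 * ext_words
    end = len(packet)
    if first & 0x20:  # padding present; the last byte holds the pad length
        end -= packet[-1]
    if offset > end:
        raise ValueError("RTP header is longer than the packet")
    return packet[offset:end]


def media_event(stream_sid: str, audio: bytes) -> str:
    """Wrap raw audio bytes in a Twilio-style `media` JSON message."""
    return json.dumps(
        {
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": base64.b64encode(audio).decode("ascii")},
        }
    )
```

A receive loop would call `rtp_payload` on each UDP datagram from Asterisk and send the result of `media_event` over the WebSocket to the voice AI service.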

Port reference

Call flow

Here's what happens when someone calls your number:

Environment and dependencies with uv

This project uses uv to manage Python environments and pinned dependencies.

Setup

Files

Common commands

Updating dependencies

Add new libraries to the dependencies list in pyproject.toml, and then:

Or, if you just want uv.lock to use the latest version of dependencies (including submodules):

Docker builds also install from pyproject.toml for consistency.

Architecture overview

Signal flow

Key components

1. Asterisk PBX (deployment/asterisk-server/)

The core telephony engine running in Docker:

Config Files:

2. Shim Server (src/servers/asterisk_shim_server.py)

FastAPI application that bridges Asterisk to your voice AI service:

3. Call Session Manager (src/ari/call_session.py)

The heart of the real-time audio processing:

4. ARI Supervisor (src/ari/asterisk_ari_supervisor.py)

Manages all active calls:

5. Voice Agent Server (src/servers/voice_agent_server.py)

An example FastAPI application that bridges the shim server to OpenAI Realtime API:

Quick start

Prerequisites

1. Set up AWS Chime SIP trunk

2. Configure DNS

Before setting up TLS certificates, you need to configure DNS so that AWS Chime can resolve your Asterisk server's hostname. Create an A record pointing your SIP subdomain to your EC2 instance's Elastic IP:

This DNS record must be in place before:

After creating the record, wait for DNS propagation (usually a few minutes, but can take up to 48 hours depending on TTL). You can verify with:
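If you prefer to check propagation from Python rather than a command-line tool, a minimal sketch using only the standard library could look like the following. The hostname and IP address in the usage note are placeholders, and the helper names are hypothetical; the `lookup` parameter exists only so the check can be exercised without network access.

```python
import socket
from typing import Callable, List


def resolved_ipv4(hostname: str, lookup: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = lookup(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


def dns_points_to(
    hostname: str, expected_ip: str, lookup: Callable = socket.getaddrinfo
) -> bool:
    """True once the A record for `hostname` includes `expected_ip`."""
    try:
        return expected_ip in resolved_ipv4(hostname, lookup)
    except socket.gaierror:
        # The name does not resolve yet (or the resolver is unreachable).
        return False
```

For example, `dns_points_to("sip.example.com", "203.0.113.10")` returns `True` only once the record is visible to the resolver on the machine running the check, so it is worth running it from outside your own network as well.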

3. Install Asterisk server

4. Configure TLS certificates

AWS Chime requires TLS for SIP. Use Let's Encrypt:

5. Configure Asterisk

Edit deployment/asterisk-server/asterisk-config/pjsip.conf:

Edit deployment/asterisk-server/asterisk-config/extensions.conf:

6. Deploy shim server

7. Test the system

Call your AWS Chime phone number. You should see:

Configuration reference

Environment variables

Asterisk configuration files

pjsip.conf - SIP trunking to AWS Chime:

extensions.conf - Dialplan routing:

ari.conf - REST API access:

http.conf - HTTP server settings:

Real-time audio format

The WebSocket API is modeled after Twilio's Media Streams. If you've integrated with Twilio before, this will look familiar: same event structure, same audio format.

Start event (shim → voice server)

Media event (bidirectional)

Audio specs:

Clear event (voice server → shim)

Clears the audio buffer immediately. Used for barge-in / interruption handling.

Mark event (bidirectional)

Used for tracking audio playback position. The shim ACKs marks when the corresponding audio has actually been transmitted.
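One way to picture how clear and mark interact is a single ordered queue that holds both audio frames and marks. The sketch below is an illustration under stated assumptions, not the repository's implementation: the class name `OutboundAudioQueue` is hypothetical, frames are assumed to be 160 bytes (20 ms of 8 kHz mu-law), and returning pending marks on clear mirrors Twilio's documented behavior rather than anything confirmed for this shim.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

FRAME_BYTES = 160  # assumed: 20 ms of 8 kHz, 8-bit mu-law audio


class OutboundAudioQueue:
    """Ordered queue of audio frames and marks awaiting transmission."""

    def __init__(self) -> None:
        self._items: Deque[Tuple[str, object]] = deque()

    def push_audio(self, audio: bytes) -> None:
        """Split audio into fixed-size frames, padding the tail with mu-law silence."""
        for i in range(0, len(audio), FRAME_BYTES):
            chunk = audio[i : i + FRAME_BYTES].ljust(FRAME_BYTES, b"\xff")
            self._items.append(("audio", chunk))

    def push_mark(self, name: str) -> None:
        """Queue a mark behind whatever audio is already buffered."""
        self._items.append(("mark", name))

    def clear(self) -> List[str]:
        """Drop everything buffered (barge-in); return the marks that were pending."""
        pending = [str(value) for kind, value in self._items if kind == "mark"]
        self._items.clear()
        return pending

    def next_frame(self) -> Tuple[Optional[bytes], List[str]]:
        """Pop the next frame to send, plus marks to ACK once it has been sent."""
        frame: Optional[bytes] = None
        acked: List[str] = []
        while self._items:
            kind, value = self._items[0]
            if kind == "audio":
                if frame is not None:
                    break  # the next frame belongs to the following tick
                frame = value  # type: ignore[assignment]
            else:
                acked.append(str(value))
            self._items.popleft()
        return frame, acked
```

The point of the design is ordering: a mark is only reported after every frame queued ahead of it has been handed to the sender, so the voice server can tell how much of its audio the caller has actually heard.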

Stop event (either direction)

Architecture deep dive

Why this topology?

Asterisk is mature, stable, and handles SIP/RTP at scale. But it doesn't natively support AI voice interactions. By using ARI (Asterisk REST Interface) and ExternalMedia channels, we can:

RTP → WebSocket bridge

The shim server maintains two concurrent audio loops per call:

Loop 1: RTP → WebSocket (Caller Audio)

Loop 2: WebSocket → RTP (AI Audio)

This keeps the outbound stream on a steady cadence: Asterisk expects an audio frame every 20 ms, regardless of network jitter on the WebSocket connection.
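The 20 ms cadence can be sketched as a deadline-based loop that sends a frame on every tick and substitutes silence when the AI side has nothing buffered. This is a minimal sketch under assumptions, not the shim's actual loop: it assumes 8 kHz mu-law audio (8000 samples/s × 0.020 s = 160 one-byte samples per frame), and `paced_sender`, `next_frame`, and `send` are hypothetical names.

```python
import asyncio
from typing import Callable, Optional

FRAME_SECONDS = 0.020               # one frame every 20 ms
FRAME_BYTES = 160                   # 8000 samples/s * 0.020 s, 1 byte per sample
SILENCE = b"\xff" * FRAME_BYTES     # 0xFF is mu-law digital silence


async def paced_sender(
    next_frame: Callable[[], Optional[bytes]],
    send: Callable[[bytes], None],
    frames: int,
) -> None:
    """Send `frames` frames at a fixed cadence, filling gaps with silence."""
    loop = asyncio.get_running_loop()
    deadline = loop.time()
    for _ in range(frames):
        # Fall back to silence when no AI audio is available for this tick.
        send(next_frame() or SILENCE)
        # Advance an absolute deadline so per-tick delays do not accumulate.
        deadline += FRAME_SECONDS
        await asyncio.sleep(max(0.0, deadline - loop.time()))
```

Tracking an absolute deadline instead of sleeping a flat 20 ms each iteration means a slow tick is followed by a shorter sleep, so the average rate stays at 50 frames per second.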

ExternalMedia channels

Asterisk's ExternalMedia channel type creates a client-mode RTP connection:
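For orientation, the ARI request that creates such a channel can be sketched as follows. The query parameters are those of ARI's `POST /channels/externalMedia` operation; the base URL, application name, and RTP address in the usage note are placeholders, and `external_media_request` is a hypothetical helper that only builds the request rather than sending it.

```python
from typing import Tuple
from urllib.parse import urlencode


def external_media_request(
    ari_base: str, app: str, rtp_host: str, rtp_port: int, codec: str = "ulaw"
) -> Tuple[str, str]:
    """Build the (method, URL) pair for creating an ExternalMedia channel."""
    params = {
        "app": app,                                # Stasis app that owns the channel
        "external_host": f"{rtp_host}:{rtp_port}", # where Asterisk sends RTP
        "format": codec,
        "encapsulation": "rtp",
        "transport": "udp",
        "connection_type": "client",               # Asterisk connects out to us
        "direction": "both",
    }
    return "POST", f"{ari_base.rstrip('/')}/channels/externalMedia?{urlencode(params)}"
```

Calling `external_media_request("http://localhost:8088/ari", "shim", "127.0.0.1", 9999)` yields a URL you could send with any HTTP client, using the ARI credentials configured in ari.conf.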

Mixing bridge

Each call uses an ARI mixing bridge:
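A hedged sketch of the ARI calls involved in bridging a caller to its ExternalMedia channel is shown below. The endpoints (`POST /bridges`, `POST /bridges/{bridgeId}/addChannel`, `DELETE /bridges/{bridgeId}`) are part of ARI; the helper names and the idea of expressing the sequence as a list of requests are illustrative assumptions, not the repository's code.

```python
from typing import Dict, List, Tuple

# (HTTP method, ARI path, query parameters)
Request = Tuple[str, str, Dict[str, str]]


def bridge_setup(bridge_id: str, caller_channel: str, media_channel: str) -> List[Request]:
    """Requests that create a mixing bridge and place both channels in it."""
    return [
        ("POST", "/bridges", {"type": "mixing", "bridgeId": bridge_id}),
        # addChannel accepts a comma-separated list of channel IDs.
        (
            "POST",
            f"/bridges/{bridge_id}/addChannel",
            {"channel": f"{caller_channel},{media_channel}"},
        ),
    ]


def bridge_teardown(bridge_id: str) -> List[Request]:
    """Request that destroys the bridge once the call has ended."""
    return [("DELETE", f"/bridges/{bridge_id}", {})]
```

Because a mixing bridge accepts any number of channels, further participants would be added with another `addChannel` request against the same bridge ID.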

This allows future enhancements like:

Production considerations

Scaling

Vertical Scaling:

Horizontal Scaling:

Security

TLS Everywhere:

Firewall Rules (Security Groups):

The repo includes a Lambda function that automatically updates your security group when AWS publishes new IP ranges for AMAZON, EC2, and CHIME_VOICECONNECTOR services.
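The core of such an updater is selecting the right CIDR blocks from AWS's published `ip-ranges.json` and diffing them against the current rules. The sketch below shows that filtering step only, assuming the published document layout (a `prefixes` list whose entries carry `ip_prefix`, `region`, and `service`); the function names are hypothetical and the repository's Lambda may be structured differently.

```python
from typing import Iterable, List, Mapping, Optional, Set, Tuple

SERVICES = frozenset({"AMAZON", "EC2", "CHIME_VOICECONNECTOR"})


def cidrs_for_services(
    ip_ranges: Mapping,
    services: Iterable[str] = SERVICES,
    region: Optional[str] = None,
) -> List[str]:
    """Return the IPv4 CIDRs published for the given services (optionally one region)."""
    wanted = set(services)
    cidrs = {
        prefix["ip_prefix"]
        for prefix in ip_ranges.get("prefixes", [])
        if prefix["service"] in wanted and (region is None or prefix["region"] == region)
    }
    return sorted(cidrs)


def rule_changes(current: Iterable[str], desired: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (CIDRs to authorize, CIDRs to revoke) to move from current to desired."""
    current_set, desired_set = set(current), set(desired)
    return desired_set - current_set, current_set - desired_set
```

Passing `services=["CHIME_VOICECONNECTOR"]` narrows the result to the signaling and media ranges for Voice Connector, which is usually a much smaller set than the full AMAZON or EC2 lists.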

Credentials:

Monitoring

Health Endpoints:

Asterisk CLI:

Logs:

Troubleshooting

Call not connecting

Check Asterisk SIP registration:

Check ARI connectivity:

Check shim server status:

No audio

Check RTP ports are open:

Check ExternalMedia channel:

Enable RTP debugging:

High latency

Check network to AWS Chime:

Check CPU usage:

Reduce concurrent calls if CPU > 80%

File structure

Resources

Asterisk

AWS

Docker Hub

Frameworks

Background reading
