The following code:
from typing import Any
import io

import librosa
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Processor and model are loaded once, at import time.
audio_proc = Wav2Vec2Processor.from_pretrained(
    "vitouphy/wav2vec2-xls-r-300m-timit-phoneme"
)
audio_model = Wav2Vec2ForCTC.from_pretrained(
    "vitouphy/wav2vec2-xls-r-300m-timit-phoneme"
)

def compute_phonemes(audio_content: bytes) -> list[dict[str, Any]]:
    try:
        # Decode the raw bytes and resample to the 16 kHz the model expects.
        audio_file = io.BytesIO(audio_content)
        speech, sr = librosa.load(audio_file, sr=16000)
    except Exception as e:
        print("Error loading speech", e)
        return []
    model_inputs = audio_proc(
        speech, sampling_rate=sr, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = audio_model(**model_inputs).logits
    # ... (decoding of the logits into phoneme dicts omitted here)
works completely fine on a GitHub Codespaces 2-CPU instance and runs in about 0.4 seconds.
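To make the timing claim reproducible, here is a minimal sketch of how the call can be measured; sample.wav is a hypothetical short 16 kHz mono clip, not a file from the actual project:

import time

# Hypothetical test clip; any short 16 kHz mono WAV will do.
with open("sample.wav", "rb") as f:
    payload = f.read()

start = time.perf_counter()
compute_phonemes(payload)
print(f"inference took {time.perf_counter() - start:.2f}s")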
However, when the exact same code, the exact same API, is deployed on Render (which usually works like a charm), I systematically get the following error logs:
[2025-01-09 18:25:30 +0000] [98] [CRITICAL] WORKER TIMEOUT (pid:141)
[2025-01-09 18:25:31 +0000] [98] [ERROR] Worker (pid:141) was sent code 134!
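For what it's worth, exit code 134 is 128 + 6, i.e. SIGABRT: gunicorn aborts a worker that stays busy past its timeout, which is 30 seconds by default. Raising the timeout only hides the slowness rather than fixing it, but as a stopgap sketch (gunicorn config files are plain Python, and timeout/workers are standard gunicorn settings):

# gunicorn.conf.py -- stopgap sketch: give slow inference more headroom
# before the arbiter aborts the worker. This masks the problem; it does
# not explain why the forward pass is so slow on Render.
timeout = 120   # default is 30 seconds
workers = 1     # a single worker, so the model is loaded into memory once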
I traced the issue down, and this is exactly the line that triggers the SIGABRT signal (exit code 134):
logits = audio_model(**model_inputs).logits
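In case it helps anyone reproduce the trace: since the worker dies from SIGABRT, Python's built-in faulthandler can dump the stack of the stuck worker at abort time. A minimal sketch, to be placed at the top of the app module:

import faulthandler

# Install handlers for fatal signals (SIGABRT included), so the gunicorn
# timeout abort prints the Python stack showing where the worker hung.
faulthandler.enable()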
The same code runs "fine" on Railway, but takes ~10 s there instead of the 0.4 s it takes on GitHub Codespaces.
I can't see any difference in configuration between the environments: same Python version (3.12), same PyTorch version (2.2.0).
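In case someone spots a difference I'm missing, here is a sketch of the runtime details worth comparing across the environments (all standard os/platform/torch calls):

import os
import platform

import torch

# Runtime details that often differ between container hosts even when
# Python/PyTorch versions match: CPU model, visible core count, and the
# thread pools PyTorch sizes from the cores it sees.
print("machine:", platform.machine(), platform.processor())
print("visible CPUs:", os.cpu_count())
print("torch threads:", torch.get_num_threads())
print("torch interop threads:", torch.get_num_interop_threads())
print(torch.__config__.parallel_info())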
Any ideas or pointers to a solution (or an alternative) would be welcome.