Converting to Remotorch

Your PyTorch code stays almost identical. The only thing that changes is where it runs.

What You're Leaving Behind

All of this complexity? Gone.

CUDA Drivers

No more matching CUDA versions to PyTorch versions to driver versions. We handle all of that.

Docker + nvidia-docker

No 15GB GPU container images. No nvidia-container-toolkit. No docker-compose GPU configs.

GPU Hardware

No $2,000 graphics card. No cloud GPU instances sitting idle. No hardware maintenance.

Complex Deployment

No Kubernetes GPU scheduling. No CUDA capability checks. No "works on my machine" GPU issues.
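
For context, here's the kind of startup guard a traditional deployment carries (standard PyTorch calls, shown only to illustrate what goes away; the 7.0 threshold is an arbitrary example):

# Typical capability check in a traditional GPU deployment
import torch

if not torch.cuda.is_available():
    raise RuntimeError("CUDA not available: check driver and toolkit versions")

major, minor = torch.cuda.get_device_capability(0)
if (major, minor) < (7, 0):  # example threshold: require Volta or newer
    raise RuntimeError(f"Compute capability {major}.{minor} is below 7.0")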

The Traditional Way

Deploying GPU-Powered Apps Today

Here's what you typically need to get PyTorch running in production:

Infrastructure Requirements

requirements.txt + infrastructure
# Python dependencies
torch==2.1.0+cu121
torchvision==0.16.0+cu121
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvtx-cu12==12.1.105
triton==2.1.0

# System requirements (not in pip)
# - NVIDIA Driver >= 530.30.02
# - CUDA Toolkit 12.1
# - cuDNN 8.9
# - Ubuntu 20.04+ or similar
# - nvidia-container-toolkit
# - 16GB+ RAM recommended

Dockerfile

Dockerfile
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04

# Install Python and dependencies
RUN apt-get update && apt-get install -y \
    python3.10 python3-pip \
    libgl1-mesa-glx libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Image size: ~15GB
CMD ["python3", "app.py"]

docker-compose.yml

docker-compose.yml
version: '3.8'
services:
  app:
    build: .
    runtime: nvidia  # Requires nvidia-container-toolkit
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

The Remotorch Way

Same App, Zero GPU Infrastructure

Here's the entire setup with Remotorch:

Infrastructure Requirements

requirements.txt
remotorch
flask  # or whatever else your app needs

# That's it. No CUDA. No nvidia packages.
# No GPU drivers. No special hardware.

# Works on:
# - $6/month VPS
# - Raspberry Pi
# - MacBook Air
# - AWS Lambda
# - Literally anything with Python 3.8+

Dockerfile (Optional)

Dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Image size: ~150MB (not 15GB!)
CMD ["python", "app.py"]

100x smaller Docker image: 150MB vs 15GB
Zero NVIDIA dependencies: no drivers, no CUDA, no cuDNN
Works on any hosting: no GPU instance required

Code Comparison

See what actually changes in your Python code

Example 1: Basic Tensor Operations

BEFORE: Traditional PyTorch
import torch

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create tensors on GPU
x = torch.randn(1000, 1000, device=device)
y = torch.randn(1000, 1000, device=device)

# Matrix multiplication
result = torch.matmul(x, y)

# Get result back to CPU
output = result.cpu().numpy()

AFTER: With Remotorch
import remotorch

# Connect to remote GPU
remotorch.connect(
    api_key="rk_...",
    gpu_type="rtx4090"  # Choose your GPU
)  # <- only new lines

# Create tensors on remote GPU
x = remotorch.randn(1000, 1000)
y = remotorch.randn(1000, 1000)

# Matrix multiplication (runs remotely)
result = remotorch.matmul(x, y)

# Get result back to local
output = result.cpu().numpy()
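
Because the surface API matches, one way to keep a script runnable both locally and remotely is to alias the backend at startup. This is a sketch: the USE_REMOTE_GPU and REMOTORCH_API_KEY environment variables are illustrative conventions, not part of either library.

import os

# Pick the backend once at startup; everything below is backend-agnostic
if os.environ.get("USE_REMOTE_GPU") == "1":
    import remotorch as backend
    backend.connect(api_key=os.environ["REMOTORCH_API_KEY"],
                    gpu_type="rtx4090")
else:
    import torch as backend  # falls back to local CPU/GPU

x = backend.randn(1000, 1000)
y = backend.randn(1000, 1000)
result = backend.matmul(x, y)  # local or remote, same call
output = result.cpu().numpy()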

Example 2: Image Classification API

BEFORE: Requires a GPU server
import torch
import torchvision.models as models
from flask import Flask, request

app = Flask(__name__)

# Load model on GPU (requires CUDA)
device = torch.device("cuda")
model = models.resnet50(pretrained=True)
model = model.to(device)
model.eval()

@app.route("/classify", methods=["POST"])
def classify():
    img_tensor = preprocess(request.files["image"])
    img_tensor = img_tensor.to(device)

    with torch.no_grad():
        output = model(img_tensor)

    return {"class": output.argmax().item()}

Requires: GPU instance ($100-300/mo), CUDA drivers, nvidia-docker

AFTER: Runs anywhere
import remotorch
from flask import Flask, request

app = Flask(__name__)

# Connect to remote GPU
remotorch.connect(api_key="rk_...", gpu_type="rtx4090")

# Load model on remote GPU
model = remotorch.hub.load("resnet50", pretrained=True)
model.eval()

@app.route("/classify", methods=["POST"])
def classify():
    img_tensor = preprocess(request.files["image"])
    img_tensor = remotorch.tensor(img_tensor)

    with remotorch.no_grad():
        output = model(img_tensor)

    return {"class": output.argmax().cpu().item()}

Requires: Any Python host ($6/mo VPS works!)
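
Both versions call a preprocess helper that isn't shown. Here's a minimal sketch for the Remotorch version, assuming Pillow and NumPy; the 224x224 size and ImageNet normalization are what resnet50 expects. (The traditional version would additionally wrap the result with torch.tensor before moving it to the device.)

import numpy as np
from PIL import Image

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(file):
    # Decode, resize, and normalize into the NCHW layout resnet50 expects
    img = Image.open(file).convert("RGB").resize((224, 224))
    arr = np.asarray(img, dtype=np.float32) / 255.0   # HWC in [0, 1]
    arr = (arr - IMAGENET_MEAN) / IMAGENET_STD        # ImageNet stats
    return arr.transpose(2, 0, 1)[np.newaxis, ...]    # 1 x 3 x 224 x 224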

Example 3: Batch Processing Script

BEFORE: Local GPU required
import torch

# This script only works if you have a GPU
if not torch.cuda.is_available():
    raise RuntimeError("No GPU found!")

device = torch.device("cuda")

def process_batch(data):
    tensor = torch.tensor(data, device=device)

    # Heavy computation
    result = tensor @ tensor.T
    result = torch.nn.functional.softmax(result, dim=-1)

    return result.cpu().numpy()

AFTER: No local GPU needed
import remotorch

# Works from any machine
remotorch.connect(api_key="rk_...", gpu_type="rtx4090")

def process_batch(data):
    tensor = remotorch.tensor(data)

    # Heavy computation (runs on remote GPU)
    result = tensor @ tensor.T
    result = remotorch.nn.functional.softmax(result, dim=-1)

    return result.cpu().numpy()
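
A hypothetical driver for this function, feeding a large array through in fixed-size chunks (the data here is random, standing in for a real workload):

import numpy as np

data = np.random.rand(10_000, 128).astype(np.float32)
batch_size = 1_000

results = [process_batch(data[i:i + batch_size])
           for i in range(0, len(data), batch_size)]
scores = np.stack(results)  # 10 batches, each 1000 x 1000 after softmax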

The Complete Change List

Here's everything you need to change in your code:

Traditional PyTorch       Remotorch
import torch              import remotorch
(nothing)                 remotorch.connect(api_key="...", gpu_type="rtx4090")
torch.tensor(...)         remotorch.tensor(...)
torch.randn(...)          remotorch.randn(...)
torch.matmul(a, b)        remotorch.matmul(a, b)
tensor.to("cuda")         not needed (already on the GPU)
tensor.cuda()             not needed
tensor.cpu()              tensor.cpu() (same)

That's it. Really.

Change your imports from torch to remotorch, add one connect call, and you're done. Your tensor operations, model inference, and everything else work the same way.

Migration Checklist

Convert your project in 5 minutes

1. Install Remotorch

   pip install remotorch

2. Get your API key

   Sign up at remotorch.com and create an API key from your dashboard.

3. Find & replace imports

   Replace import torch with import remotorch.

4. Add connect call

   Add remotorch.connect(api_key="...", gpu_type="rtx4090") at the start of your script.

5. Remove GPU infrastructure

   Delete your Dockerfile's CUDA base image, nvidia-docker config, and GPU instance. Deploy to any cheap host.
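
Once the checklist is done, a quick smoke test (using only calls shown in this guide) confirms the remote GPU is reachable:

import remotorch

remotorch.connect(api_key="rk_...", gpu_type="rtx4090")

x = remotorch.randn(64, 64)
y = remotorch.matmul(x, x)
print(y.cpu().numpy().shape)  # (64, 64), computed on the remote GPU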

Ready to Ditch the GPU Headaches?

Start running inference from any device in minutes.
