CPU Inference in your browser with Candle

Sampling settings

Set top_k to 0 to disable top-k filtering. When both top-k and top-p are set, top-k filtering runs first, then top-p filtering is applied to the tokens that remain.
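
The filtering order described above can be sketched in Rust as follows. This is a minimal illustration, not the demo's actual code: the function name `filter_top_k_top_p` is hypothetical, the input is assumed to be a valid probability distribution (already softmaxed, with a positive sum), top-p is assumed to operate on probabilities renormalized after top-k, and a `top_p` of 1.0 or more is assumed to disable nucleus filtering.

```rust
/// Hypothetical helper (not part of Candle): apply top-k, then top-p,
/// to a probability distribution over token ids.
/// Returns (token_id, renormalized_probability) pairs, most likely first.
fn filter_top_k_top_p(probs: &[f32], top_k: usize, top_p: f32) -> Vec<(usize, f32)> {
    // Pair each probability with its token id and sort by probability, descending.
    let mut candidates: Vec<(usize, f32)> = probs.iter().copied().enumerate().collect();
    candidates.sort_by(|a, b| b.1.total_cmp(&a.1));

    // Step 1: top-k. A value of 0 disables this filter.
    if top_k > 0 {
        candidates.truncate(top_k);
    }

    // Step 2: top-p (nucleus) over the survivors of top-k.
    // Assumption: top_p >= 1.0 means "keep everything".
    if top_p < 1.0 {
        let total: f32 = candidates.iter().map(|c| c.1).sum();
        let mut cumulative = 0.0;
        let mut keep = 0;
        for c in &candidates {
            cumulative += c.1 / total;
            keep += 1;
            // Stop once the kept tokens cover at least top_p of the mass.
            if cumulative >= top_p {
                break;
            }
        }
        candidates.truncate(keep);
    }

    // Renormalize so the remaining probabilities sum to 1 before sampling.
    let total: f32 = candidates.iter().map(|c| c.1).sum();
    candidates.iter().map(|&(id, p)| (id, p / total)).collect()
}
```

For example, with probabilities `[0.1, 0.5, 0.2, 0.15, 0.05]`, `top_k = 3`, and `top_p = 0.8`, top-k keeps tokens 1, 2, and 3, and top-p then keeps only tokens 1 and 2, because together they cover about 82% of the remaining mass.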

Output

Reported metrics: TTFT (time to first token), tok/s (tokens per second), and tokens (token count).