# Neuron Instance

The object returned by createNeuron(). Provides methods for chatting, state management, and lifecycle control.

## Methods
### send(message): AsyncGenerator<string>

Stream tokens from the LLM as an async generator.

```js
for await (const token of neuron.send('Hello!')) {
  element.textContent += token
}
```

- Adds the message and response to conversation history automatically
- Throws if the model isn't loaded yet or a response is already generating
- Use neuron.isLoading to check if ready
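Before the first send(), the model may still be loading. As a minimal sketch, readiness can be awaited with a small polling helper; whenReady and its interval are illustrative assumptions, not part of the API:

```typescript
// Hypothetical helper: resolve once isLoading goes false. Only the isLoading
// flag is from the documented API; the polling approach is an assumption.
async function whenReady(
  neuron: { isLoading: boolean },
  intervalMs = 100,
): Promise<void> {
  while (neuron.isLoading) {
    // Sleep briefly between checks instead of spinning
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
}
```

An event or promise exposed by the library (if available) would be preferable to polling; this is just a fallback pattern.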
### complete(message): Promise<string>

Get the full response as a single string. A convenience wrapper around send().

```js
const reply = await neuron.complete('Tell me a joke')
console.log(reply)
```

### stop(): void

Stop generation mid-stream. The send() async generator exits gracefully.

```js
neuron.stop()
```

### setModel(modelId): Promise<void>
Switch to a different model. Downloads if not cached. Reuses the existing worker.
```js
await neuron.setModel('Llama-3.2-3B-Instruct-q4f16_1-MLC')
```

### setPersonalityDocs(docs): void
Replace personality documents at runtime. Takes effect on the next send() call.
```js
neuron.setPersonalityDocs([
  { type: 'zero-shot', content: 'You are now a chef.' },
  { type: 'knowledge', content: 'You specialize in French cuisine.' },
])
```

### setSystemPrompt(prompt): void

Replace the system prompt entirely. Clears any personality docs.

```js
neuron.setSystemPrompt('You are a helpful assistant.')
```

### setTemperature(value): void
Update the sampling temperature for subsequent send() calls. No engine rebuild: the new value applies on the next message.

```js
neuron.setTemperature(0.1) // strict, factual
neuron.setTemperature(0.8) // creative, more variation
```

### setMaxTokens(value): void
Update the per-response token cap. Takes effect on the next send().

```js
neuron.setMaxTokens(2048)
```

### setFrequencyPenalty(value): void
Update the repetition penalty (0.0–2.0, default 0.5). Higher values reduce repeated phrases.

```js
neuron.setFrequencyPenalty(0.7)
```

### setMaxHistoryTurns(value): void
Update how many recent turns are included in subsequent prompts. This doesn't truncate stored history; it only changes what's sent in future requests.

```js
neuron.setMaxHistoryTurns(20)
```

> **Sampler hot-swap (0.3.0+)**: The four set* methods above all hot-swap without touching the engine, making them ideal for live settings UIs where users tweak the temperature mid-chat. For per-instance routing where the model changes, use setModel(), which reloads weights but reuses the worker.
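Because the setters are independent, a settings UI can apply them together in one change handler. A minimal sketch, assuming only the four documented set* methods; the SamplerControls and SamplerSettings shapes and applySamplerSettings are illustrative, not part of the API:

```typescript
// The subset of the neuron instance this sketch relies on (an assumption
// mirroring the four hot-swap setters documented above).
interface SamplerControls {
  setTemperature(value: number): void
  setMaxTokens(value: number): void
  setFrequencyPenalty(value: number): void
  setMaxHistoryTurns(value: number): void
}

// A plain settings object, e.g. bound to a settings form.
interface SamplerSettings {
  temperature: number
  maxTokens: number
  frequencyPenalty: number
  maxHistoryTurns: number
}

// Apply all sampler settings at once. Safe to call mid-chat: each setter
// takes effect on the next send() without an engine rebuild.
function applySamplerSettings(neuron: SamplerControls, s: SamplerSettings): void {
  neuron.setTemperature(s.temperature)
  neuron.setMaxTokens(s.maxTokens)
  neuron.setFrequencyPenalty(s.frequencyPenalty)
  neuron.setMaxHistoryTurns(s.maxHistoryTurns)
}
```

For example, a "Creative mode" toggle could call applySamplerSettings(neuron, { temperature: 0.9, maxTokens: 2048, frequencyPenalty: 0.5, maxHistoryTurns: 20 }) from its change handler.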
### getHistory(): Array<{ role, content }>

Get a copy of the conversation history.

```js
const history = neuron.getHistory()
// [{ role: 'user', content: 'Hi' }, { role: 'assistant', content: 'Hello!' }]

// Save to localStorage
localStorage.setItem('chat', JSON.stringify(history))
```

### setHistory(messages): void
Restore conversation history from a saved state.

```js
const saved = JSON.parse(localStorage.getItem('chat') || '[]')
neuron.setHistory(saved)
```

### clearHistory(): void
Clear all conversation history. The next send() starts a fresh conversation.

```js
neuron.clearHistory()
```

### destroy(): void
Terminate the worker and release its resources. Call this when unmounting a component or leaving a page.

```js
// In Vue
onUnmounted(() => neuron.destroy())

// In React
useEffect(() => () => neuron.destroy(), [])
```

## Properties

All properties are read-only.
### isLoading: boolean

true while the model is downloading or initializing. Check it before calling send().

```js
if (!neuron.isLoading) {
  for await (const token of neuron.send('Hi')) { /* ... */ }
}
```

### isGenerating: boolean
true while the LLM is actively generating a response.

```js
if (neuron.isGenerating) {
  showStopButton()
}
```

### loadProgress: { percent: number, text: string }
The current model-loading progress.

```js
// Use the onProgress callback for reactive updates
createNeuron({
  onProgress: (pct, text) => {
    progressBar.style.width = `${pct}%`
    statusText.textContent = text
  },
})

// Or poll (less ideal)
console.log(neuron.loadProgress) // { percent: 45, text: "Loading params_shard_3..." }