[Header image: AI-generated illustration of a chatbot.]

Tutorial: Chatbot in AstroJS with Cloudflare Workers AI (Part 2) - Coding the backend

In this part of the tutorial, we will create a new API route in AstroJS to handle the requests from the client. To streamline the process, we’ll create a single endpoint that accepts HTTP POST requests. This endpoint will receive the user’s message, forward it to the AI model, process the AI’s response, and return it to the client as a ReadableStream.
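For reference, here is an illustrative example of the JSON body the client will POST to this endpoint: the conversation history as an array of role/content messages, oldest first (the messages themselves are made up):

[
  { "role": "user", "content": "What is Astro?" },
  { "role": "assistant", "content": "Astro is a web framework for content-driven websites." },
  { "role": "user", "content": "Can it stream AI responses?" }
]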

Let’s create a new file named chatbot.ts within the src/pages/api/ directory. To begin, we’ll define the POST endpoint and handle the incoming client request:

src/pages/api/chatbot.ts
import type { APIContext } from "astro";

// Render this route on demand instead of prerendering it at build time.
export const prerender = false;

export async function POST({ request, locals }: APIContext) {
  // The client sends the conversation history as a JSON array of messages.
  const payload = (await request.json()) as RoleScopedChatInput[];

  // Prepend a system prompt to the user-provided messages.
  let messages: RoleScopedChatInput[] = [
    { role: "system", content: "You are a friendly assistant" },
  ];
  messages = messages.concat(payload);
  ...
}

We extract the messages from the incoming request, parse the body as JSON, and prepend a system message (e.g., “You are a friendly assistant”). The messages are typed as a RoleScopedChatInput[] array, a Cloudflare-specific type for model interactions. This type should be available to you if you configured the project as described in the previous part of this tutorial.
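Conceptually, RoleScopedChatInput is just a role/content pair. The sketch below is illustrative only; consult your installed version of @cloudflare/workers-types for the exact definition:

// Conceptual sketch; see @cloudflare/workers-types for the real type.
interface RoleScopedChatInput {
  role: string; // e.g. "system", "user", or "assistant"
  content: string;
}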

To interact with the Cloudflare model API, we’ll execute the following code:

src/pages/api/chatbot.ts
export async function POST({ request, locals }: APIContext) {
  ...
  // The AI binding configured in the previous part of the tutorial.
  const { AI } = locals.runtime.env;

  let eventSourceStream: ReadableStream<Uint8Array> | undefined;
  let retryCount = 0;
  let successfulInference = false;
  let lastError;
  const MAX_RETRIES = 3;

  // Retry the inference call a few times; beta models can fail intermittently.
  while (successfulInference === false && retryCount < MAX_RETRIES) {
    try {
      eventSourceStream = (await AI.run("@cf/meta/llama-3-8b-instruct-awq", {
        stream: true, // ask the model for a streamed (SSE) response
        messages,
      })) as ReadableStream<Uint8Array>;
      successfulInference = true;
    } catch (err) {
      lastError = err;
      retryCount++;
      console.error(err);
      console.log(`Retrying #${retryCount}...`);
    }
  }

  // Give up after MAX_RETRIES failed attempts.
  if (eventSourceStream === undefined) {
    if (lastError) {
      throw lastError;
    }
    throw new Error(`Problem with model`);
  }
  ...
}

To access the AI model, we follow these steps:

  1. Environment Variable Retrieval: We use the locals object to access the previously configured AI binding (refer to the prior tutorial part for setup details; a minimal typing sketch follows this list). This binding is our connection to the Cloudflare AI model.

  2. Error Handling: The code retries the inference call up to MAX_RETRIES times, keeping the last error so it can be rethrown if every attempt fails. This is especially relevant when using a beta Cloudflare AI model.

  3. Query to the model: The AI.run() call performs the actual request to the Cloudflare AI model. Notably, the stream argument is set to true, indicating that the response should be returned as a ReadableStream.
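If TypeScript does not recognize the AI binding on locals.runtime.env, your environment typings may be incomplete. A minimal sketch of what they could look like is shown below; it assumes the @astrojs/cloudflare adapter from Part 1, and the binding name AI must match your wrangler configuration:

src/env.d.ts
/// <reference types="astro/client" />

// Sketch only: "AI" must match the binding name declared in wrangler;
// the Ai type comes from @cloudflare/workers-types.
type Env = { AI: Ai };
type Runtime = import("@astrojs/cloudflare").Runtime<Env>;

declare namespace App {
  interface Locals extends Runtime {}
}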

The last step consists of preprocessing the data before sending it back to the client:

src/pages/api/chatbot.ts
export async function POST({ request, locals }: APIContext) {
  ...
  // Re-stream the model output, stripping the SSE framing so the client
  // receives only the generated text.
  const response: ReadableStream = new ReadableStream({
    start(controller) {
      eventSourceStream.pipeTo(
        new WritableStream({
          write(chunk) {
            const text = new TextDecoder().decode(chunk);
            for (const line of text.split("\n")) {
              // Skip SSE lines that carry no payload.
              if (!line.includes("data: ")) {
                continue;
              }
              // The model signals the end of the stream with "data: [DONE]".
              if (line.includes("[DONE]")) {
                controller.close();
                break;
              }
              try {
                // Extract the generated text from the JSON payload.
                const data = JSON.parse(line.split("data: ")[1]);
                controller.enqueue(new TextEncoder().encode(data.response));
              } catch (err) {
                console.error("Could not parse response", err);
              }
            }
          },
        })
      );

      // Stop streaming if the client disconnects.
      request.signal.addEventListener("abort", () => {
        controller.close();
      });
    },
  });

  return new Response(response, {
    headers: {
      "content-type": "text/event-stream",
      "Cache-Control": "no-cache",
      "Access-Control-Allow-Origin": "*",
      "Connection": "keep-alive",
    },
  });
}

Each ReadableStream chunk consists of a string formatted as data: {"response": "Hello, how are you?", "p": "abcdeaf.."}. The stream concludes with a data: [DONE] message. Our code extracts the “response” field from each chunk, encodes it, and transmits it to the client.
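To check the endpoint end to end, you can run a small script like the following sketch (for example from your dev site). It assumes your local setup from Part 1 exposes the AI binding and that the dev server runs on Astro's default port:

// Smoke test sketch; adjust the URL to your dev server.
const res = await fetch("http://localhost:4321/api/chatbot", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify([{ role: "user", content: "Hello!" }]),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk is already plain text, since the endpoint strips the SSE framing.
  console.log(decoder.decode(value));
}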

With this final data processing step, we’ve successfully completed the backend portion of our chatbot project. The next and final tutorial part will focus on creating the Astro component responsible for receiving the server response and displaying it within the browser.