Comparing Elixir and Phoenix Performance with a Community-Driven OpenAI Client

As a developer exploring the capabilities of Elixir and Phoenix, I recently conducted a performance experiment using a community-driven OpenAI client. The goal was to evaluate how well Elixir, with its powerful concurrency model, handles high-volume API requests. The results were impressive, demonstrating Elixir's efficiency and robustness in managing concurrent tasks across multiple CPU cores.

Background

At CourseMojo, we are scaling an AI assistant teacher for public schools that provides real-time, intelligent responses to students' answers to open-ended questions. During our load testing, we found that heavy CPU utilization was primarily due to calls to the OpenAI API. This prompted me to explore Elixir and Phoenix to see if they could offer a more efficient solution.

Project Setup

To set up the project, we used the following steps:

Create a New Phoenix Project: Run the following commands to install Hex and the Phoenix project generator, create a new project without Ecto, and navigate into the directory:
```
mix local.hex
mix archive.install hex phx_new
mix phx.new openai_experiment --no-ecto
cd openai_experiment
```
Add Dependencies: Add the ex_openai library to the deps list in your mix.exs file, alongside the dependencies that Phoenix generated:
```
defp deps do
  [
    {:ex_openai, "~> 1.6"}
  ]
end
```
Run mix deps.get to fetch the dependencies.

Configure the Client: Set up your configuration in config/config.exs:

```
import Config

config :ex_openai,
  api_key: System.get_env("OPENAI_API_KEY")

# the ex_openai http client seems to use this.
config :hackney,
  pool_size: 1800,
  max_connections: 1800
```
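A note on where the key is read: config/config.exs is evaluated at build time, so in a compiled release the value of OPENAI_API_KEY would be captured when the project is built. A minimal sketch of one alternative, assuming the project has a config/runtime.exs file (which is evaluated when the application boots); the `api_key` setting is the same one shown above:

```elixir
# config/runtime.exs (evaluated at boot rather than at compile time)
import Config

config :ex_openai,
  # System.fetch_env!/1 raises at boot if the variable is missing,
  # instead of silently storing nil as the API key.
  api_key: System.fetch_env!("OPENAI_API_KEY")
```

When running through `iex -S mix` as in this post, either file works, because Mix re-evaluates the configuration on every start.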

Increase File Descriptor Limit: To handle more than 1000 concurrent connections, increase the file descriptor limit:
```
ulimit -n 4096
```
Implement the Concurrency Logic: We used Task.async_stream/3 to manage concurrency, allowing us to process tasks efficiently across multiple CPU cores.
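To make the effect of `max_concurrency` concrete before the full script, here is a minimal, self-contained sketch that swaps the OpenAI call for a simulated 100 ms wait. The module `ConcurrencyDemo` and the function `fake_request/1` are hypothetical stand-ins and are not part of the experiment.

```elixir
defmodule ConcurrencyDemo do
  # Hypothetical stand-in for a network call: waits, then returns a tagged result.
  def fake_request(index) do
    Process.sleep(100)
    {:ok, index}
  end

  # Runs `count` fake requests with at most `limit` in flight at a time.
  # Returns {elapsed_milliseconds, results}.
  def run(count, limit) do
    {micros, results} =
      :timer.tc(fn ->
        1..count
        |> Task.async_stream(&fake_request/1, max_concurrency: limit, timeout: 5_000)
        |> Enum.to_list()
      end)

    {div(micros, 1000), results}
  end
end

{elapsed_ms, results} = ConcurrencyDemo.run(20, 10)
IO.puts("#{length(results)} tasks finished in about #{elapsed_ms} ms")
```

With 20 tasks and a limit of 10, the work runs in roughly two waves, so the elapsed time should land near 200 ms instead of the 2 seconds a sequential loop would need. Each element of `results` is wrapped by the stream, for example `{:ok, {:ok, 1}}`, which is why the summary function in the script matches on nested tuples.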

Elixir Script

Here is the Elixir script lib/openai_experiment.ex used for the experiment:

```
defmodule OpenaiExperiment do
  alias ExOpenAI.Chat
  alias ExOpenAI.Components.ChatCompletionRequestUserMessage

  # attempt starts at 1, so each request is tried up to @max_retries + 1 times.
  @max_retries 4
  @retry_delay 2000  # milliseconds
  @concurrency_limit 500

  def generate_chat_completion(index, attempt \\ 1) do
    # Note: the system prompt reuses the user-message struct with role: :system.
    msgs = [
      %ChatCompletionRequestUserMessage{role: :system, content: "You are a helpful assistant."},
      %ChatCompletionRequestUserMessage{role: :user, content: "What is the number #{index}?"}
    ]

    case Chat.create_chat_completion(msgs, "gpt-4o-mini") do
      {:ok, response} ->
        IO.inspect(response, label: "Response for number #{index}")
        {:ok, index}
      # Transport-level failures: wait, then retry.
      {:error, %HTTPoison.Error{reason: reason}} when attempt <= @max_retries ->
        IO.puts("HTTP error for number #{index}: #{inspect(reason)}, attempt #{attempt}")
        :timer.sleep(@retry_delay)
        generate_chat_completion(index, attempt + 1)
      # Any other error: wait, then retry.
      {:error, reason} when attempt <= @max_retries ->
        IO.puts("Retrying number #{index} due to #{inspect(reason)}, attempt #{attempt}")
        :timer.sleep(@retry_delay)
        generate_chat_completion(index, attempt + 1)
      # Retries exhausted: report the failure to the caller.
      {:error, reason} ->
        IO.inspect(reason, label: "Final error for number #{index}")
        {:error, index, reason}
    end
  end

  def execute_chats_concurrently do
    # :timer.tc/1 returns the elapsed time in microseconds along with the result.
    {time, results} = :timer.tc(fn ->
      1..1800
      # The 30-second timeout covers each task as a whole, including its retries and sleeps.
      |> Task.async_stream(&generate_chat_completion/1, max_concurrency: @concurrency_limit, timeout: 30_000)
      |> Enum.to_list()
    end)

    summarize_results(results)
    IO.puts("Total execution time: #{time / 1_000_000} seconds")
  end

  # Task.async_stream wraps each task's return value in {:ok, value}.
  defp summarize_results(results) do
    successes = Enum.count(results, fn
      {:ok, {:ok, _index}} -> true
      _ -> false
    end)

    failures = Enum.count(results, fn
      {:ok, {:error, _index, _reason}} -> true
      _ -> false
    end)

    IO.puts("Summary:")
    IO.puts("Successful tasks: #{successes}")
    IO.puts("Failed tasks: #{failures}")
  end
end

# Run the test
#OpenaiExperiment.execute_chats_concurrently()
```
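One behaviour the script leaves implicit: with the default `on_timeout: :exit`, a task that runs longer than `timeout` causes the process calling `Task.async_stream` to exit, so no summary would be printed. The sketch below shows one alternative, the `on_timeout: :kill_task` option, which reports a timed-out task as `{:exit, :timeout}` in the results instead. `TimeoutDemo` and `slow_or_fast/1` are hypothetical, and sleeps stand in for API calls; this is not what the experiment ran.

```elixir
defmodule TimeoutDemo do
  # Hypothetical workload: multiples of 5 deliberately outlast the timeout.
  def slow_or_fast(index) do
    if rem(index, 5) == 0, do: Process.sleep(1_000), else: Process.sleep(10)
    {:ok, index}
  end

  # Runs the workload and tallies successes, errors, and timeouts.
  def run(range) do
    range
    |> Task.async_stream(&slow_or_fast/1,
      max_concurrency: 10,
      timeout: 200,
      on_timeout: :kill_task
    )
    |> Enum.reduce(%{ok: 0, error: 0, timeout: 0}, fn
      {:ok, {:ok, _index}}, acc -> Map.update!(acc, :ok, &(&1 + 1))
      {:ok, _other}, acc -> Map.update!(acc, :error, &(&1 + 1))
      {:exit, :timeout}, acc -> Map.update!(acc, :timeout, &(&1 + 1))
      {:exit, _reason}, acc -> Map.update!(acc, :error, &(&1 + 1))
    end)
  end
end

IO.inspect(TimeoutDemo.run(1..10), label: "Summary")
```

For the range 1..10, the two multiples of 5 are counted as timeouts and the other eight as successes. Tallying timeouts separately keeps the success and failure counts adding up to the number of tasks started.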

Compiling and Running the Example

Compile the Project: Ensure your project is compiled by running:
```
mix compile
```
Start the Elixir Interactive Shell: Launch the interactive Elixir shell with your project loaded:
```
iex -S mix
```
Run the Example: Once inside the interactive shell, execute the function to run the experiment:
```
OpenaiExperiment.execute_chats_concurrently()
```

Results

The results of the experiment were as follows:

Successful tasks: 1800
Failed tasks: 0
Total execution time: 23.436011 seconds

Analysis

The total elapsed (wall-clock) time for the Elixir script was about 23.4 seconds for 1800 requests, or roughly 77 completed requests per second. Here are some key takeaways from the results:

Concurrency Handling: Elixir handled 1800 tasks efficiently, leveraging its lightweight process model with a concurrency limit of 500.
Multi-Core Utilization: Elixir utilized all 4 CPU cores, demonstrating its ability to efficiently distribute tasks across available resources.
Built-in Retries: The script included a retry mechanism to handle transient errors, ensuring robustness in task execution.
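
The retry behaviour in the last takeaway can also be separated from the API call itself. The following is a sketch rather than the script's actual code: `RetryDemo.with_retries/3` is a hypothetical helper, and it doubles the delay after each failure (exponential backoff), whereas the script used a fixed 2000 ms delay.

```elixir
defmodule RetryDemo do
  # Calls `fun` until it returns {:ok, value} or no retries are left.
  # The delay doubles after every failed attempt.
  def with_retries(fun, retries_left, delay_ms) do
    case fun.() do
      {:ok, value} ->
        {:ok, value}

      {:error, _reason} when retries_left > 0 ->
        Process.sleep(delay_ms)
        with_retries(fun, retries_left - 1, delay_ms * 2)

      {:error, reason} ->
        {:error, reason}
    end
  end
end

# Simulated flaky operation: fails twice, then succeeds.
# An Agent holds the number of calls made so far.
{:ok, counter} = Agent.start_link(fn -> 0 end)

flaky = fn ->
  attempt = Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end)
  if attempt < 3, do: {:error, :temporarily_unavailable}, else: {:ok, attempt}
end

IO.inspect(RetryDemo.with_retries(flaky, 4, 10))
```

Backing off progressively is a common choice when many concurrent tasks may hit the same rate limit at once, since a fixed delay tends to send the retries back in a single burst.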

This experiment demonstrated that Elixir, with its concurrency model, is a strong contender for handling high-volume, concurrent API requests using a community-driven OpenAI client. Its lightweight processes and multi-core scheduling make it a strong choice for scenarios where performance and speed are critical.

If you’re working on a project that involves extensive use of the OpenAI API or any other high-volume API interactions, consider leveraging Elixir to maximize performance and efficiency. The results of this experiment highlight the potential gains in speed and responsiveness that can be achieved with the right choice of technology.

It's worth noting that the OpenAI client used in this experiment is community-driven, not official, which may also affect the performance observed here. Additionally, keep an eye on developments in Elixir and Phoenix, as they continue to evolve and improve.

This post was written with the help of GPT-4 using the Elixir ex_openai library.

Fri 30 August 2024

Tags: Code, elixir
