I fixed a memory leak in the Metal backend of Leela Chess Zero (lc0). Here is how I did it.

Get lc0 Metal backend source code

The lc0 Metal backend is developed in almaudoh’s new-metal-backend-mpsgraph branch. I cloned the source code with the following commands.

git clone -b new-metal-backend-mpsgraph --recurse-submodules https://github.com/almaudoh/lc0.git lc0-almaudoh
cd lc0-almaudoh
git checkout 055e19e4081a32fdb763569375eaa05c7a2be37f

These commands create a subdirectory lc0-almaudoh in the current directory and download the source code into it. They then check out revision 055e19e4081a32fdb763569375eaa05c7a2be37f, which has the memory leak.

Generate an Xcode project from lc0

I detect memory leaks with Xcode Instruments, so I need to generate an Xcode project for lc0. The commands are as follows.

cd lc0-almaudoh  # skip this if you are already in lc0-almaudoh from the previous step
meson setup build --backend xcode

This creates a subdirectory build in the lc0-almaudoh directory and generates the Xcode project at build/lc0.xcodeproj.

Configure lc0 Xcode project

The generated Xcode project does not yet have the following configured:

Scheme of lc0 executable
Argument passed on launch

Therefore, I manually selected the scheme for the lc0 executable and specified a launch argument. The steps are as follows.

In Terminal, open the generated Xcode project with the following command:

open build/lc0.xcodeproj

This opens the project in the Xcode graphical user interface (GUI).

Scheme of lc0 executable

In the Xcode GUI, click the following menu items.

Product -> Scheme -> lc0@exe

[Screenshots: Xcode Product menu, Scheme submenu, and the lc0@exe scheme]

This selects the scheme for the lc0 executable.

Argument passed on launch

In the Xcode GUI, click the following items.

Product -> Scheme -> Edit Scheme…

[Screenshot: Xcode Product menu, Edit Scheme… item]

Run -> Arguments -> +

[Screenshot: Xcode scheme editor, adding an item under Arguments]

Enter “-b metal” -> Close

[Screenshot: Xcode scheme editor with the -b metal argument]

This sets lc0’s backend to Metal.

Hack universal chess interface

The memory leak is easy to detect when lc0 searches indefinitely. Unfortunately, I could not get lc0 to search indefinitely under Xcode Instruments, for the following reasons:

lc0 waits for commands on the standard input stream.
Xcode Instruments does not seem to let the user type commands into the standard input stream.
Redirecting a command file to the standard input stream makes lc0 terminate as soon as it has processed every command in the file.

Therefore, I hacked the universal chess interface (UCI) loop so that lc0 receives a command on launch that makes it search indefinitely.

I modified the source code as follows.

src/chess/uciloop.cc:132:
void UciLoop::RunLoop() {
  std::cout.setf(std::ios::unitbuf);
  std::string line;

  // Hack: dispatch "go infinite" once before reading standard input.
  line = "go infinite";
  LOGFILE << ">> " << line;
  try {
    auto command = ParseCommand(line);
    DispatchCommand(command.first, command.second);
  } catch (Exception& ex) {
    SendResponse(std::string("error ") + ex.what());
  }

  while (std::getline(std::cin, line)) {
    LOGFILE << ">> " << line;
    try {
      auto command = ParseCommand(line);
      // Ignore empty line.
      if (command.first.empty()) continue;
      if (!DispatchCommand(command.first, command.second)) break;
    } catch (Exception& ex) {
      SendResponse(std::string("error ") + ex.what());
    }
  }
}

With this change, lc0 processes an extra go infinite command before it reads anything from the standard input stream.
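The pattern behind this hack can be tried outside lc0. The following is a minimal, self-contained sketch, not lc0's code: RunLoopWithPreset and the handle callback are hypothetical names introduced here for illustration. It handles a list of preset commands first and only then reads further commands from the input stream, stopping at end of input or when the handler returns false.

```cpp
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a UCI-style command loop (not lc0's UciLoop).
// It first passes every command in `preset` to `handle`, then reads more
// commands from `in` until end of input or until `handle` returns false.
void RunLoopWithPreset(std::istream& in,
                       const std::vector<std::string>& preset,
                       const std::function<bool(const std::string&)>& handle) {
  // Injected commands run before anything is read from the stream.
  for (const std::string& line : preset) {
    if (!handle(line)) return;
  }
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;  // Ignore empty lines.
    if (!handle(line)) break;    // The handler asks the loop to stop.
  }
}
```

Calling it with an std::istringstream as the input and {"go infinite"} as the preset list shows that the preset command is handled even when the stream is empty, which is the situation lc0 is in when Instruments launches it.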

Build lc0 executable

The lc0 executable is now ready to be built from the source code. I built it with the following menu item.

Product -> Build

[Screenshot: Xcode Product menu, Build item]

This builds the lc0 executable at build/debug/lc0.

Download a Leela Chess Zero network

lc0 evaluates chess moves with a neural network, so I need to download a network for it to use. The steps are as follows.

In Safari, download this file: http://training.lczero.org/get_network?sha=82d14d7d8a4f00826f269901d5e31df1a7b2112c20604dc8bee4008271db4d88.
Move the downloaded file to the directory build/debug/.

This places the last T60 320x24 network, 606511, in the directory that contains the lc0 executable, so that lc0 can find it.

Run Xcode Instruments

The Xcode project is now configured, and lc0 can run the Metal backend with a neural network indefinitely. Now I can run Xcode Instruments to find memory leaks in the Metal backend of lc0. The steps are as follows.

Product -> Profile

All -> Leaks -> Choose

Record

[Screenshot: Xcode Instruments, Leaks template, Record button]

Instruments runs lc0 and collects information about memory allocations and leaks. I saw the following result.

[Screenshot: Xcode Instruments Leaks call tree]

It shows that memory usage keeps growing without bound, and that the function runInferenceWithBatchSize has allocated 1366.22 MB. I double-clicked the function runInferenceWithBatchSize to see more detail inside it.

[Screenshot: Leaks call tree inside runInferenceWithBatchSize]

It shows that 100% of that memory is allocated by the following source code.

    _resultDataDictionary = [_graph runWithFeeds:@{_inputTensor : _inputTensorData}
                                   targetTensors:_targetTensors
                                targetOperations:nil];

Fix memory leak

The previous section identified the source code that leaks memory. In this section, I fix the leak.

My pull request fixes the memory leak with the following changes.

Release _inputTensorData explicitly.
Surround the function that leaks memory with an autorelease pool.
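The pull request itself is Objective-C and is not reproduced here. As a rough mental model only, the C++ sketch below imitates autorelease-pool behaviour to show why the second change matters. Every name in it (ToyPool, MakeAutoreleased, RunInference, LeakyLoop, BoundedLoop) is hypothetical and none of it is lc0 or Apple code. The assumption it illustrates is that an object handed to a pool stays alive until that pool is destroyed, so a loop that never leaves the outer pool keeps accumulating temporaries, while a pool per call frees them each time.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy model only: this is NOT Objective-C's real autorelease machinery.
class ToyPool {
 public:
  ToyPool() : parent_(current_) { current_ = this; }
  // Restores the enclosing pool; the buffers in objects_ are freed right
  // after this body runs, when the member is destroyed.
  ~ToyPool() { current_ = parent_; }
  ToyPool(const ToyPool&) = delete;
  ToyPool& operator=(const ToyPool&) = delete;

  // Hands a new buffer to the innermost live pool and returns a borrowed
  // pointer. Assumes at least one ToyPool is alive.
  static std::vector<float>* MakeAutoreleased(std::size_t n) {
    current_->objects_.push_back(std::make_unique<std::vector<float>>(n));
    return current_->objects_.back().get();
  }

  std::size_t live_objects() const { return objects_.size(); }

 private:
  static ToyPool* current_;
  ToyPool* parent_;
  std::vector<std::unique_ptr<std::vector<float>>> objects_;
};
ToyPool* ToyPool::current_ = nullptr;

// Hypothetical stand-in for one inference call that creates a temporary.
void RunInference() { ToyPool::MakeAutoreleased(1024); }

// Without an inner pool, temporaries pile up in the outer pool.
std::size_t LeakyLoop(int iterations) {
  ToyPool outer;
  for (int i = 0; i < iterations; ++i) RunInference();
  return outer.live_objects();
}

// With a pool per iteration, each temporary is freed before the next call.
std::size_t BoundedLoop(int iterations) {
  ToyPool outer;
  for (int i = 0; i < iterations; ++i) {
    ToyPool inner;  // Plays the role of an autorelease pool around the call.
    RunInference();
  }
  return outer.live_objects();
}
```

In this model, LeakyLoop leaves one buffer alive per iteration in the outer pool, whereas BoundedLoop leaves none, which mirrors the difference between the growing and the flat memory graphs in Instruments.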

To check out my pull request locally, follow the steps below.

In Terminal, enter the following commands:

git stash
git fetch origin pull/2/head:fix-memory-leak
git checkout fix-memory-leak
git stash pop

These commands stash the changes in the working directory (the UCI hack), fetch the pull request into a new branch fix-memory-leak, check out that branch, and reapply the stashed changes.

I reran Xcode Instruments to see whether the memory leak was fixed. The result is shown below.

[Screenshot: Xcode Instruments Leaks result after the fix]

It shows that memory usage no longer grows without bound.

Remarks

Xcode Instruments reports another memory leak that my pull request does not fix. Can you find it and fix it?