CUDA issue with NER (Named Entity Recognition) for ML predictions

I'm attempting to use Named Entity Recognition (NER) in ML.NET (https://github.com/dotnet/machinelearning/issues/630) to predict categories for words and phrases within a large body of text.

I'm currently using three NuGet packages to try to get this working:

Microsoft.ML (3.0.0-preview.23511.1)

Microsoft.ML.TorchSharp (0.21.0-preview.23511.1)

TorchSharp-cpu (0.101.1)

At the point of training the model [estimator.Fit(dataView)], I get the following error:

Field not found: 'TorchSharp.torch.CUDA'.

I may have misunderstood something here, but I expected processing to run on the CPU via the TorchSharp-cpu package, and I'm not sure where the CUDA reference is coming from. CUDA also looks like a package reference to me rather than a field?

using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.TorchSharp;
using System;
using System.Collections.Generic;
using System.Windows.Forms;

namespace NerTester
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private class TestSingleSentenceData
        {
            public string Sentence;
            public string[] Label;
        }

        private class Label
        {
            public string Key { get; set; }
        }

        private void startButton_Click(object sender, EventArgs e)
        {
            try
            {
                var context = new MLContext();
                context.FallbackToCpu = true;
                context.GpuDeviceId = null;

                var labels = context.Data.LoadFromEnumerable(
                    new[] {
                        new Label { Key = "PERSON" },
                        new Label { Key = "CITY" },
                        new Label { Key = "COUNTRY" }
                    });

                var dataView = context.Data.LoadFromEnumerable(
                    new List<TestSingleSentenceData>(new TestSingleSentenceData[] {
                        new TestSingleSentenceData()
                        {   // Testing longer than 512 words.
                            Sentence = "Alice and Bob live in the USA",
                            Label = new string[]{"PERSON", "0", "PERSON", "0", "0", "0", "COUNTRY"}
                        },
                        new TestSingleSentenceData()
                        {
                            Sentence = "Alice and Bob live in the USA",
                            Label = new string[]{"PERSON", "0", "PERSON", "0", "0", "0", "COUNTRY"}
                        },
                    }));
                var chain = new EstimatorChain<ITransformer>();
                var estimator = chain.Append(context.Transforms.Conversion.MapValueToKey("Label", keyData: labels))
                    .Append(context.MulticlassClassification.Trainers.NameEntityRecognition(outputColumnName: "outputColumn"))
                    .Append(context.Transforms.Conversion.MapKeyToValue("outputColumn"));

                var transformer = estimator.Fit(dataView);
                transformer.Dispose();

                MessageBox.Show("Success!");
            }
            catch (Exception ex)
            {
                MessageBox.Show($"Error: {ex.Message}");
            }
        }
    }
}

The application is running on x64, and the documentation for NER appears to be limited.

Any help would be greatly appreciated.

I've tried changing the NuGet packages I'm referencing, including the use of libtorch packages.

I've attempted running the application in both x86 and x64 configurations.

I've added code to try to force CPU usage rather than GPU (CUDA).

There are 2 best solutions below

You only need to reference two packages for that experiment:

<ItemGroup>
   <PackageReference Include="Microsoft.ML.TorchSharp" Version="0.21.0-preview.23511.1" />
   <PackageReference Include="libtorch-cpu-<your-platform>" Version="2.1.0.1" />
</ItemGroup>

This is because Microsoft.ML.TorchSharp already pulls in all the other references you need as dependencies.

Now the bad news.
At runtime you will get a bunch of errors related to missing files or DLLs. I spent a good amount of time trying to figure out what I was missing, but I guess it is just related to the versions of some libraries.

In the end I cloned the whole repo, compiled it for my platform (Win-x64), and looked for the files whose sizes differed (some don't have a version, so the size was the only option). It boils down to 7 libraries.

The libraries brought in by the build are all there, just not the ones ML.NET expects.

I replaced the DLLs with the ones from the ML.NET repo, copied them into the folder \bin\Debug\net7.0\runtimes\win-x64\native, and everything works fine.

Maybe there is a smarter solution but I couldn't find any.

UPDATE:

As anrouxel suggested on GitHub, the best way is to use libtorch-cpu version 1.13.0.1.

@lahbton and @Leftyx,

I've managed to get your example to work, although I turned it into a console app. The problem came from the version of the "libtorch-cpu-win-x64" package (or whichever platform package you were using). Microsoft.ML 3.0.0-preview.23511.1 and Microsoft.ML.TorchSharp 0.21.0-preview.23511.1 use version 1.13.0.1 of "libtorch-cpu-win-x64" (or the equivalent package for another platform), as these files from the ML.NET repo show:

test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj

  <ItemGroup Condition="'$(TargetArchitecture)' == 'x64'">
    <PackageReference Include="libtorch-cpu-win-x64" Version="$(LibTorchVersion)" Condition="$([MSBuild]::IsOSPlatform('Windows')) AND '$(TargetArchitecture)' == 'x64'" />
      <!-- <PackageReference Include="TorchSharp-cuda-windows" Version="0.99.5" Condition="$([MSBuild]::IsOSPlatform('Windows'))" />   -->
    <PackageReference Include="libtorch-cpu-linux-x64" Version="$(LibTorchVersion)" Condition="$([MSBuild]::IsOSPlatform('Linux')) AND '$(TargetArchitecture)' == 'x64'" />
    <PackageReference Include="libtorch-cpu-osx-x64" Version="$(LibTorchVersion)" Condition="$([MSBuild]::IsOSPlatform('OSX')) AND '$(TargetArchitecture)' == 'x64'" />
  </ItemGroup>

eng/Versions.props

<LibTorchVersion>1.13.0.1</LibTorchVersion>
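
The same pattern could be used in your own project so that the libtorch version is defined in one place. This is only a sketch, assuming a Windows x64 target. Declaring the property directly in the .csproj (rather than in a separate props file, as the ML.NET repo does) is my own simplification, not something ML.NET requires:

```xml
<!-- Sketch: pin the libtorch version once and reuse it in the package reference. -->
<PropertyGroup>
  <!-- Assumed to match the value in the ML.NET repo's eng/Versions.props -->
  <LibTorchVersion>1.13.0.1</LibTorchVersion>
</PropertyGroup>

<ItemGroup>
  <PackageReference Include="libtorch-cpu-win-x64" Version="$(LibTorchVersion)" />
</ItemGroup>
```

On Linux or macOS the package name would presumably be libtorch-cpu-linux-x64 or libtorch-cpu-osx-x64, as in the test project above.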

Here is my code: Program.cs

using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.TorchSharp;

public class Program
{
    // Main method
    public static void Main(string[] args)
    {
        try
        {
            var context = new MLContext();
            context.FallbackToCpu = true;
            context.GpuDeviceId = null;

            var labels = context.Data.LoadFromEnumerable(
            new[] {
                new Label { Key = "PERSON" },
                new Label { Key = "CITY" },
                new Label { Key = "COUNTRY"  }
            });

            var dataView = context.Data.LoadFromEnumerable(
                new List<TestSingleSentenceData>(new TestSingleSentenceData[] {
                    new TestSingleSentenceData()
                    {   // Testing longer than 512 words.
                        Sentence = "Alice and Bob live in the USA",
                        Label = new string[]{"PERSON", "0", "PERSON", "0", "0", "0", "COUNTRY"}
                    },
                     new TestSingleSentenceData()
                     {
                        Sentence = "Alice and Bob live in the USA",
                        Label = new string[]{"PERSON", "0", "PERSON", "0", "0", "0", "COUNTRY"}
                     },
                }));
            var chain = new EstimatorChain<ITransformer>();
            var estimator = chain.Append(context.Transforms.Conversion.MapValueToKey("Label", keyData: labels))
               .Append(context.MulticlassClassification.Trainers.NameEntityRecognition(outputColumnName: "outputColumn"))
               .Append(context.Transforms.Conversion.MapKeyToValue("outputColumn"));

            var transformer = estimator.Fit(dataView);
            transformer.Dispose();

            Console.WriteLine("Success!");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error: {ex.Message}");
        }
    }

    private class Label
    {
        public string Key { get; set; }
    }

    private class TestSingleSentenceData
    {
        public string Sentence;
        public string[] Label;
    }
}

ConsoleApp1.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
      <PackageReference Include="libtorch-cpu-win-x64" Version="1.13.0.1" />
      <PackageReference Include="Microsoft.ML" Version="3.0.0-preview.23511.1" />
      <PackageReference Include="Microsoft.ML.TorchSharp" Version="0.21.0-preview.23511.1" />
  </ItemGroup>

</Project>

https://github.com/dotnet/machinelearning/issues/630#issuecomment-1806550435