Tensorboard profile: No profile data was found


Problem

I don't see anything in the Profile section of Tensorboard. After running tensorboard --logdir logdir, the interface reports that no profile data was found.

The tree of logdir is as follows:

logdir
├── events.out.tfevents.17026478. gpu.profile-empty
└── plugins
    └── profile
        ├── 2023_12_15_12_41_18
        │   ├──  gpu.input_pipeline.pb
        │   ├──  gpu.kernel_stats.pb
        │   ├──  gpu.memory_profile.json.gz
        │   ├──  gpu.overview_page.pb
        │   ├──  gpu.tensorflow_stats.pb
        │   ├──  gpu.trace.json.gz
        │   └──  gpu.xplane.pb
        ├── 2023_12_15_12_41_21
        │   ├──  gpu.input_pipeline.pb
        │   ├──  gpu.kernel_stats.pb
        │   ├──  gpu.memory_profile.json.gz
        │   ├──  gpu.overview_page.pb
        │   ├──  gpu.tensorflow_stats.pb
        │   ├──  gpu.trace.json.gz
        │   └──  gpu.xplane.pb
        ├── 2023_12_15_12_41_22
        │   ├──  gpu.input_pipeline.pb
        │   ├──  gpu.kernel_stats.pb
        │   ├──  gpu.memory_profile.json.gz
        │   ├──  gpu.overview_page.pb
        │   ├──  gpu.tensorflow_stats.pb
        │   ├──  gpu.trace.json.gz
        │   └──  gpu.xplane.pb
        ├── 2023_12_15_12_41_23
        │   ├──  gpu.input_pipeline.pb
        │   ├──  gpu.kernel_stats.pb
        │   ├──  gpu.memory_profile.json.gz
        │   ├──  gpu.overview_page.pb
        │   ├──  gpu.tensorflow_stats.pb
        │   ├──  gpu.trace.json.gz
        │   └──  gpu.xplane.pb
        ├── 2023_12_15_12_41_24
        │   ├──  gpu.input_pipeline.pb
        │   ├──  gpu.kernel_stats.pb
        │   ├──  gpu.memory_profile.json.gz
        │   ├──  gpu.overview_page.pb
        │   ├──  gpu.tensorflow_stats.pb
        │   ├──  gpu.trace.json.gz
        │   └──  gpu.xplane.pb
        ├── 2023_12_15_12_41_25
        │   ├──  gpu.input_pipeline.pb
        │   ├──  gpu.kernel_stats.pb
        │   ├──  gpu.memory_profile.json.gz
        │   ├──  gpu.overview_page.pb
        │   ├──  gpu.tensorflow_stats.pb
        │   ├──  gpu.trace.json.gz
        │   └──  gpu.xplane.pb
        └── 2023_12_15_12_41_26
            ├──  gpu.input_pipeline.pb
            ├──  gpu.kernel_stats.pb
            ├──  gpu.memory_profile.json.gz
            ├──  gpu.overview_page.pb
            ├──  gpu.tensorflow_stats.pb
            ├──  gpu.trace.json.gz
            └──  gpu.xplane.pb

9 directories, 50 files
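
To rule out a broken copy after scp, the layout above can be checked programmatically. The sketch below uses only the Python standard library. The function names list_profile_runs and runs_missing_xplane are made up for illustration, and it assumes that runs live under logdir/plugins/profile/<run>/ and that each run should contain an .xplane.pb file, as in the tree above.

```python
from pathlib import Path


def list_profile_runs(logdir):
    """Map each run directory under <logdir>/plugins/profile to its sorted file names."""
    profile_root = Path(logdir) / "plugins" / "profile"
    if not profile_root.is_dir():
        return {}
    runs = {}
    for run_dir in sorted(p for p in profile_root.iterdir() if p.is_dir()):
        runs[run_dir.name] = sorted(f.name for f in run_dir.iterdir() if f.is_file())
    return runs


def runs_missing_xplane(runs):
    """Return the names of runs that contain no *.xplane.pb file."""
    return [
        name
        for name, files in runs.items()
        if not any(f.endswith(".xplane.pb") for f in files)
    ]


if __name__ == "__main__":
    runs = list_profile_runs("logdir")  # path as used in the question
    for name, files in runs.items():
        print(name, len(files), "files")
    print("runs without xplane.pb:", runs_missing_xplane(runs))
```

Running it on both the cluster and the laptop and comparing the output would show whether the copy changed anything.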

Code to generate logdir

This is a simple training loop using the tf.profiler.experimental.Profile API, inspired by this TF tutorial:

for epoch in range(1, epochs+1):

    if dataset_exists is True:
        #with tf.profiler.experimental.Trace('train', step_num=epoch, _r=1):
        with tf.profiler.experimental.Profile("logdir"):
            loss_train = model.training_step(dataset, optimizer)
    else:
        loss_train = training_step(model._model, X_train, Y_train, optimizer)
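
Each entry into tf.profiler.experimental.Profile starts a new profiling session and each exit stops it, which is consistent with the seven timestamped run directories in the tree above. One possible restructuring, shown as a sketch and not as a confirmed fix, is to open a single session with tf.profiler.experimental.start and tf.profiler.experimental.stop and to mark each epoch with Trace, as in the commented-out line. The helper name run_profiled and its profiler parameter are hypothetical; the parameter is expected to be tf.profiler.experimental and is passed in so the sketch does not hard-code the import.

```python
def run_profiled(train_step, epochs, logdir, profiler):
    """Run `epochs` training steps inside one profiling session.

    `profiler` is assumed to be `tf.profiler.experimental`; `train_step`
    is any zero-argument callable that returns the loss for one epoch.
    """
    losses = []
    profiler.start(logdir)  # one session, so one run directory
    try:
        for epoch in range(1, epochs + 1):
            # Mark each epoch as a step inside the running session.
            with profiler.Trace("train", step_num=epoch, _r=1):
                losses.append(train_step())
    finally:
        profiler.stop()  # ends the session even if a step raises
    return losses
```

With Tensorflow imported as tf, this could be called as run_profiled(lambda: model.training_step(dataset, optimizer), epochs, "logdir", tf.profiler.experimental).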

Additional information

I am running the code on a cluster in order to use a GPU. Then I use scp to copy the logdir folder from the cluster to my personal laptop.

Output of the command tensorboard --logdir logdir --inspect:

======================================================================
Processing event files... (this can take a few minutes)
======================================================================

Found event files in:
logdir

These tags are in logdir:
audio -
histograms -
images -
scalars -
tensor -
======================================================================

Event statistics for logdir:
audio -
graph -
histograms -
images -
scalars -
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor -
=====================================================================

I can add the warnings printed in the terminal if necessary.

Versions

  • tensorboard==2.8.0
  • tensorboard-data-server==0.6.1
  • tensorboard-plugin-profile==2.15.0
  • tensorboard-plugin-wit==1.8.1
  • tensorflow==2.8.0
  • tensorflow-io-gcs-filesystem==0.34.0
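
In the list above, tensorboard-plugin-profile is at 2.15.0 while tensorflow and tensorboard are at 2.8.0. Whether the plugin has to match is an assumption here, not something established in this post, but it is cheap to check on the machine where Tensorboard actually runs. The sketch below uses importlib.metadata from the standard library; the helper names major_minor, installed_versions and mismatched are made up for illustration.

```python
from importlib import metadata

PACKAGES = ("tensorflow", "tensorboard", "tensorboard-plugin-profile")


def major_minor(version):
    """'2.15.0' -> (2, 15). Assumes plain numeric major and minor parts."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatched(versions):
    """True if the installed packages do not all share one major.minor."""
    seen = {major_minor(v) for v in versions.values() if v is not None}
    return len(seen) > 1


if __name__ == "__main__":
    versions = installed_versions()
    for name, version in versions.items():
        print(f"{name}=={version}")
    print("major.minor mismatch:", mismatched(versions))
```

Since the logs are produced on the cluster and viewed on a laptop, running this in both environments would show whether the two sides differ.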

Questions/Remarks

  • According to the Tensorboard releases, users are supposed to run the same versions of Tensorflow and Tensorboard, since "Tensorboard version X tracks Tensorflow version X".
  • For those who manage to use Tensorboard with Tensorflow, which versions of Tensorflow and Tensorboard did you use?