I'm unable to get DataLad working in a Python JupyterLab notebook, but it works fine in the regular terminal. Is there something I need to do specifically to integrate DataLad with Jupyter notebooks? I installed based on the DataLad handbook: http://handbook.datalad.org/en/latest/intro/installation.html#install . Here are some specifics:
Machine specifications: macOS Mojave 10.14.6, Python 3.8.5, using Anaconda
Using ! to run the install commands from within Jupyter didn't work, although the same commands did work in the Terminal app. For example, when I run !datalad status --annex all in a Jupyter cell, it yields this error: [ERROR ] git-annex of version >= 7.20190503 is missing. Visit http://handbook.datalad.org/r.html?install for instructions on how to install DataLad and git-annex. [annexrepo.py:_check_git_annex_version:555] (MissingExternalDependency)
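Since the notebook kernel can run in a different environment (and with a different PATH) than the Terminal app, it can help to check which Python and which executables the kernel actually sees. A minimal diagnostic sketch, assuming the root cause is an environment mismatch:

```python
import shutil
import sys

# Which Python interpreter the notebook kernel is running
print(sys.executable)

# Whether the kernel's PATH can find the datalad and git-annex
# executables; None here means the kernel's environment can't see them
print(shutil.which("datalad"))
print(shutil.which("git-annex"))
```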
I tried to pip install from within Jupyter (pip install datalad), and it gave me this warning but otherwise seemed to go OK: WARNING: The directory '/Users/eprzysinda/Library/Caches/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
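As an aside, when installing from inside a notebook, the %pip magic (available in recent IPython) installs into the environment of the running kernel, which avoids the common pitfall of pip installing into a different Python than the one the notebook uses. A small sketch:

```python
# Install into the same environment the notebook kernel is using
%pip install datalad
```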
When I try to import datalad.api I get a RuntimeError with a very long traceback that starts with this:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-41-e10c67ed4457> in <module>
1 import os
2 import glob
----> 3 import datalad.api as dl
4 #import pandas as pd
5
/opt/anaconda3/lib/python3.8/site-packages/datalad/__init__.py in <module>
46
47 from .config import ConfigManager
---> 48 cfg = ConfigManager()
49
50 from .log import lgr
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in __init__(self, dataset, overrides, source)
344 self._runner = GitWitlessRunner(**run_kwargs)
345
--> 346 self.reload(force=True)
347
348 if not ConfigManager._checked_git_identity:
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in reload(self, force)
397 while to_run:
398 store_id, runargs = to_run.popitem()
--> 399 self._stores[store_id] = self._reload(runargs)
400
401 # always update the merged representation, even if we did not reload
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in _reload(self, run_args)
425 def _reload(self, run_args):
426 # query git-config
--> 427 stdout, stderr = self._run(
428 run_args,
429 protocol=StdOutErrCapture,
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in _run(self, args, where, reload, **kwargs)
787 if '-l' in args:
788 # we are just reading, no need to reload, no need to lock
--> 789 out = self._runner.run(self._config_cmd + args, **kwargs)
790 return out['stdout'], out['stderr']
791
/opt/anaconda3/lib/python3.8/site-packages/datalad/cmd.py in run(self, cmd, protocol, stdin, cwd, env, **kwargs)
377 lgr.debug('Async run:\n cwd=%s\n cmd=%s', cwd, cmd)
378 # include the subprocess manager in the asyncio event loop
--> 379 results = event_loop.run_until_complete(
380 run_async_cmd(
381 event_loop,
/opt/anaconda3/lib/python3.8/asyncio/base_events.py in run_until_complete(self, future)
590 """
591 self._check_closed()
--> 592 self._check_running()
593
594 new_task = not futures.isfuture(future)
/opt/anaconda3/lib/python3.8/asyncio/base_events.py in _check_running(self)
550 def _check_running(self):
551 if self.is_running():
--> 552 raise RuntimeError('This event loop is already running')
553 if events._get_running_loop() is not None:
554 raise RuntimeError(
RuntimeError: This event loop is already running
Let me know if anyone has any ideas on this. I'm new to Python and JupyterLab, so it's very possible I'm missing something obvious.
Thank you!
~Emily
I've posted a solution to this issue at the Jupyter Discourse Forum here; it involves importing nest_asyncio and applying it before importing datalad, based on the suggestion here.
In the future, please link to your posts if you are going to ask multiple communities for help. Having multiple groups work on your problem divides resources and potentially duplicates effort for everyone involved. It also fragments the path to a solution for others who have the same issue, because they may not realize there is an answer somewhere else.
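For completeness, here is a minimal sketch of that workaround. nest_asyncio patches the already-running Jupyter event loop so that the run_until_complete() call made by datalad's command runner (visible in the traceback above) can nest inside it. You may need to install nest_asyncio first (e.g., with %pip install nest_asyncio):

```python
import nest_asyncio

# Jupyter already runs an asyncio event loop; patch it so that nested
# run_until_complete() calls no longer raise
# "RuntimeError: This event loop is already running"
nest_asyncio.apply()

# The import that previously failed should now succeed
import datalad.api as dl
```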