I'm unable to get DataLad working in a Python JupyterLab notebook, but it works fine in the regular terminal. Is there something I need to do specifically to integrate DataLad with Jupyter notebooks? I installed based on the DataLad handbook: http://handbook.datalad.org/en/latest/intro/installation.html#install . Here are some specifics:
Machine specifications: macOS Mojave 10.14.6, Python 3.8.5, using Anaconda
Using ! to run the install commands from within Jupyter didn't work, although the same commands did work in the Terminal app. For example, when I run !datalad status --annex all in a Jupyter cell, it yields this error: [ERROR ] git-annex of version >= 7.20190503 is missing. Visit http://handbook.datalad.org/r.html?install for instructions on how to install DataLad and git-annex. [annexrepo.py:_check_git_annex_version:555] (MissingExternalDependency)
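Since the notebook kernel can run in a different environment (and with a different PATH) than the Terminal app, it can help to check which Python and which executables the kernel actually sees. A minimal diagnostic sketch, assuming the root cause is an environment mismatch:

```python
import shutil
import sys

# Which Python interpreter the notebook kernel is running
print(sys.executable)

# Whether the kernel's PATH can find the datalad and git-annex
# executables; None here means the kernel's environment can't see them
print(shutil.which("datalad"))
print(shutil.which("git-annex"))
```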
I tried to pip install from within Jupyter (pip install datalad), and it gave me this warning but otherwise seemed to go OK: WARNING: The directory '/Users/eprzysinda/Library/Caches/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
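As an aside, when installing from inside a notebook, the %pip magic (available in recent IPython) installs into the environment of the running kernel, which avoids the common pitfall of pip installing into a different Python than the one the notebook uses. A small sketch:

```python
# Install into the same environment the notebook kernel is using
%pip install datalad
```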
When I try to import datalad.api I get a RuntimeError with a very long traceback that starts with this:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-41-e10c67ed4457> in <module>
1 import os
2 import glob
----> 3 import datalad.api as dl
4 #import pandas as pd
5
/opt/anaconda3/lib/python3.8/site-packages/datalad/__init__.py in <module>
46
47 from .config import ConfigManager
---> 48 cfg = ConfigManager()
49
50 from .log import lgr
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in __init__(self, dataset, overrides, source)
344 self._runner = GitWitlessRunner(**run_kwargs)
345
--> 346 self.reload(force=True)
347
348 if not ConfigManager._checked_git_identity:
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in reload(self, force)
397 while to_run:
398 store_id, runargs = to_run.popitem()
--> 399 self._stores[store_id] = self._reload(runargs)
400
401 # always update the merged representation, even if we did not reload
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in _reload(self, run_args)
425 def _reload(self, run_args):
426 # query git-config
--> 427 stdout, stderr = self._run(
428 run_args,
429 protocol=StdOutErrCapture,
/opt/anaconda3/lib/python3.8/site-packages/datalad/config.py in _run(self, args, where, reload, **kwargs)
787 if '-l' in args:
788 # we are just reading, no need to reload, no need to lock
--> 789 out = self._runner.run(self._config_cmd + args, **kwargs)
790 return out['stdout'], out['stderr']
791
/opt/anaconda3/lib/python3.8/site-packages/datalad/cmd.py in run(self, cmd, protocol, stdin, cwd, env, **kwargs)
377 lgr.debug('Async run:\n cwd=%s\n cmd=%s', cwd, cmd)
378 # include the subprocess manager in the asyncio event loop
--> 379 results = event_loop.run_until_complete(
380 run_async_cmd(
381 event_loop,
/opt/anaconda3/lib/python3.8/asyncio/base_events.py in run_until_complete(self, future)
590 """
591 self._check_closed()
--> 592 self._check_running()
593
594 new_task = not futures.isfuture(future)
/opt/anaconda3/lib/python3.8/asyncio/base_events.py in _check_running(self)
550 def _check_running(self):
551 if self.is_running():
--> 552 raise RuntimeError('This event loop is already running')
553 if events._get_running_loop() is not None:
554 raise RuntimeError(
RuntimeError: This event loop is already running
Let me know if anyone has any ideas on this. I'm new to Python and JupyterLab, so it's very possible I'm missing something obvious.
Thank you!
~Emily
I've posted a solution to this issue at the Jupyter Discourse Forum here; it involves importing nest_asyncio and applying it before importing datalad, based on the suggestion here.
In the future, please link to your posts if you are going to ask multiple communities for help. Having multiple groups work on your problem divides resources and potentially duplicates effort for everyone involved. It also fragments the path to a solution for others who have the same issue, because they may not realize there is an answer somewhere else.
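For completeness, here is a minimal sketch of that workaround. nest_asyncio patches the already-running Jupyter event loop so that the run_until_complete() call made by datalad's command runner (visible in the traceback above) can nest inside it. You may need to install nest_asyncio first (e.g., with %pip install nest_asyncio):

```python
import nest_asyncio

# Jupyter already runs an asyncio event loop; patch it so that nested
# run_until_complete() calls no longer raise
# "RuntimeError: This event loop is already running"
nest_asyncio.apply()

# The import that previously failed should now succeed
import datalad.api as dl
```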