I am using spaCy to match text against certain dependency patterns. My problem is that the DependencyParser gives different results even for simple sentences when a single word is swapped for another with the same ground-truth POS tag. For example, in 'The baker and supervisor support the baking', 'support' is tagged as a VERB and 'baker' and 'supervisor' as NOUNs; 'baker' and 'supervisor' have an nsubj dependency to 'support', and 'baking' is the dobj of 'support'. Changing the sentence to 'The baker and oven support the baking' gives 'oven' the POS tag ADV instead of NOUN and an advmod dependency to 'support'. This makes no sense, as 'oven' is never an adverb. I suspected that the DependencyParser uses the POS tags, so that changing them could change the resulting dependencies.
I found this question [3] and managed to extract the probabilities of all POS tags for each token with Tagger.model.predict([doc]), which returns a matrix of shape len(doc) x len(tagger.labels). In the first sentence 'supervisor' got 99.8% for NOUN, while 'oven' in the second sentence got only 62% for ADV. I therefore created multiple docs for the same text, setting Token.pos_ to the second and third most probable candidates wherever the tagger is uncertain (most probable tag < 90%). I then ran the DependencyMatcher on all three docs, expecting the different POS tags to change the dependencies, but they do not. When 'oven' has the POS tag NOUN (the third most probable tag), it still has an advmod dependency to 'support', which doesn't make sense.
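The candidate selection described above can be sketched roughly as follows. This is only a minimal sketch under assumptions: the function name `candidate_tags`, its parameters `k` and `threshold`, the label list, and the probability rows are all hypothetical and made up for illustration (they are not real model output). It uses plain Python lists so it runs without spaCy.

```python
from typing import List, Sequence


def candidate_tags(
    probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    k: int = 3,
    threshold: float = 0.9,
) -> List[List[str]]:
    """For each token, return the tags to try, most probable first.

    A token whose best tag reaches `threshold` keeps only that tag;
    otherwise it gets its `k` most probable tags as alternatives.
    """
    result = []
    for row in probs:
        # Indices of the labels sorted by descending probability.
        ranked = sorted(range(len(labels)), key=lambda i: row[i], reverse=True)
        n = 1 if row[ranked[0]] >= threshold else k
        result.append([labels[i] for i in ranked[:n]])
    return result


# Made-up illustrative rows (NOT real model output), one row per token.
labels = ["ADV", "ADJ", "NOUN", "VERB"]
probs = [
    [0.001, 0.001, 0.998, 0.000],  # confident token: only the top tag is kept
    [0.62, 0.20, 0.15, 0.03],      # uncertain token: top three tags are kept
]
print(candidate_tags(probs, labels))
# prints [['NOUN'], ['ADV', 'ADJ', 'NOUN']]
```

With spaCy, `probs` would presumably be the per-doc matrix returned by Tagger.model.predict([doc]) and `labels` the tagger's labels, in the same column order; that pairing is an assumption here, not something verified by this sketch.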
So, similar to [3], I want to inspect the probabilities of all possible Token.dep_ values for each token in the doc. Unfortunately, as far as I know, DependencyParser.model.predict([doc]) doesn't provide this.
Is it possible to find uncertainties of spaCy token dependencies?
Asked by Lukas