I have a bunch of keywords stored in a 620x2 pandas dataframe seen below. I think I need to treat each entry as its own set, where semicolons separate elements. So, we end up with 1240 sets. Then I'd like to be able to search how many times keywords of my choosing appear together. For example, I'd like to figure out how many times 'computation theory' and 'critical infrastructure' appear together as a subset in these sets, in any order. Is there any straightforward way I can do this?

How Do I Count The Number of Times a Subset of Words Appear In My Pandas Dataframe?
Asked by TheMaffGuy (77 views)

There are 2 solutions below.

Answer by bui:
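Not sure if this is considered straightforward, but it works. keyword_list is the list of keywords you want to find together.
keyword_list = ['computation theory', 'critical infrastructure']
# Split each cell on semicolons (plus any whitespace after them) and turn it into a set
df['Author Keywords'] = df['Author Keywords'].fillna('').str.split(r';\s*').apply(set)
df['Index Keywords'] = df['Index Keywords'].fillna('').str.split(r';\s*').apply(set)
# Count the cells whose set contains every keyword in keyword_list
df.apply(lambda col: col.apply(lambda cell: all(kw in cell for kw in keyword_list))).sum().sum()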
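For what it's worth, here is a self-contained sketch of the same idea on a made-up three-row frame. The sample data and the helper names to_keyword_set and count_cooccurrences are my own, not from the question. It swaps the all(...) check for set.issubset, and it strips whitespace and lowercases keywords, which is an assumption about how you want matching to behave.

```python
import pandas as pd

# Toy stand-in for the real 620x2 frame; the values are invented for illustration.
df = pd.DataFrame({
    "Author Keywords": [
        "computation theory; critical infrastructure; security",
        "critical infrastructure; computation theory",
        None,
    ],
    "Index Keywords": [
        "computation theory",
        "networks; critical infrastructure",
        "critical infrastructure; computation theory; networks",
    ],
})


def to_keyword_set(cell):
    """Split one cell on semicolons into a set of stripped, lowercased keywords."""
    if pd.isna(cell):
        return set()  # a missing cell contributes an empty set
    return {part.strip().lower() for part in cell.split(";") if part.strip()}


def count_cooccurrences(frame, keywords):
    """Count the cells whose keyword set contains every keyword in `keywords`."""
    wanted = {kw.strip().lower() for kw in keywords}
    total = 0
    for column in frame.columns:
        total += sum(wanted.issubset(to_keyword_set(cell)) for cell in frame[column])
    return total


count = count_cooccurrences(df, ["computation theory", "critical infrastructure"])
print(count)  # 3: two Author Keywords cells and one Index Keywords cell
```

Order doesn't matter here because issubset only checks membership, and each of the 1240 cells is counted at most once.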
Use .loc to find whether the keywords appear together. Do this after you have split the data into 1240 sets. I don't understand whether you want to make new columns or just want to keep the columns as is.
.loc selects the rows of the data frame that satisfy a condition. You can use subset_df = df[df['column_name'].str.contains('string')] if you have only one condition. Do the column split or any other processing before you make the filters, or run the filters again after processing.
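A rough sketch of that filtering approach with two conditions rather than one, assuming the cells are still raw semicolon-separated strings. The sample data and the helper name both_present are invented for illustration. Note that str.contains matches substrings, so 'computation theory' would also match a longer keyword such as 'applied computation theory'.

```python
import pandas as pd

# Invented sample data standing in for the real frame.
df = pd.DataFrame({
    "Author Keywords": [
        "computation theory; critical infrastructure",
        "critical infrastructure; networks",
        None,
    ],
    "Index Keywords": [
        "networks",
        "critical infrastructure; computation theory",
        "computation theory",
    ],
})


def both_present(column, first, second):
    """Boolean mask: True where the cell text contains both substrings."""
    has_first = column.str.contains(first, case=False, regex=False, na=False)
    has_second = column.str.contains(second, case=False, regex=False, na=False)
    return has_first & has_second  # combine the two conditions element-wise


first, second = "computation theory", "critical infrastructure"

# Rows whose Author Keywords cell contains both keywords.
mask = both_present(df["Author Keywords"], first, second)
subset_df = df.loc[mask]

# Number of matching cells in each column, and overall.
per_column = {col: int(both_present(df[col], first, second).sum()) for col in df.columns}
total = sum(per_column.values())
print(per_column, total)
```

Passing na=False makes missing cells count as non-matches, so the mask can be used directly with .loc.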