I have two sheets with thousands of records each. The two overlap, and every record in both sheets carries an identifier. I need to migrate the records that don't already exist in one sheet into the other. How would I do that?
How to add data that doesn't exist on one list to another?
80 Views Asked by xoxoxox At
This is possible in OpenRefine, but if you are familiar with a scripting language like Python or R, I would currently recommend using it to perform the merge.
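For readers taking the scripting route, here is a minimal Python sketch of such a merge using only the standard library. It assumes both sheets have been exported and loaded as lists of dictionaries (for example with csv.DictReader) and that the shared identifier column is called "id". The function name and the sample data are hypothetical.

```python
def append_missing(rows_a, rows_b, key="id"):
    """Return rows_b plus every row of rows_a whose key is not yet in rows_b."""
    seen = {row[key] for row in rows_b}
    merged = list(rows_b)  # copy, so the input list is left untouched
    for row in rows_a:
        if row[key] not in seen:
            merged.append(row)
            seen.add(row[key])  # also guards against duplicate IDs within A
    return merged


# Hypothetical sample data standing in for the two sheets.
sheet_a = [
    {"id": "1", "name": "Ann"},
    {"id": "2", "name": "Bob"},
    {"id": "3", "name": "Cy"},
]
sheet_b = [
    {"id": "2", "name": "Bob"},
    {"id": "4", "name": "Dee"},
]

merged = append_missing(sheet_a, sheet_b)
print([row["id"] for row in merged])  # ['2', '4', '1', '3']
```

Building a set of the IDs already present keeps each lookup cheap, which matters once the sheets hold thousands of records.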
Nevertheless, here is a rudimentary recipe for performing the merge using only OpenRefine.
Assume you have two projects called A and B and want to merge into Project B every row from Project A that is not there yet. The two projects share a common ID column.
1. Prepare Project A for synchronization
In Project A, mark the rows that are already in Project B using cross. To do that, add a new column named "Sync" in Project A, based on your ID column, using the following GREL expression:
This will use the index of the row as a temporary ID for synchronization, but only for rows that are not already in Project B.
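To make the effect of this step concrete, the following Python sketch models what the "Sync" column in Project A ends up containing. It is not the GREL expression itself. The function name, the "id" column name, and the sample rows are assumptions made for illustration.

```python
def sync_column_for_a(rows_a, ids_in_b, key="id"):
    """Model of the "Sync" column in Project A.

    A row gets its own row index if its ID is missing from Project B,
    and an empty string if Project B already has it.
    """
    return [
        index if row[key] not in ids_in_b else ""
        for index, row in enumerate(rows_a)
    ]


# Hypothetical data: only "x2" already exists in Project B.
rows_a = [{"id": "x1"}, {"id": "x2"}, {"id": "x3"}]
ids_in_b = {"x2", "x9"}

sync_a = sync_column_for_a(rows_a, ids_in_b)
print(sync_a)  # [0, '', 2]
```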
2. Prepare Project B for synchronization
In Project B we also add a new column named "Sync" using the following GREL expression:
This will add a string ",0,1,2,...,6000" in the last row of column "Sync" in Project B. Note that you manually have to determine and set the two variables rowsInProjectA (currently 6000) and rowsInProjectB (currently 7000).
Then we use Split multi-valued cells on column "Sync" in Project B using the comma "," as separator. This will basically add new rows to Project B containing only a value in the column "Sync" to be able to load the missing rows from Project A.
3. Load rows from Project A
In Project B we use cross again to load the missing rows from Project A. For that we use the transformation dialog in the ALL column to be able to load several columns in one step.
This GREL expression assumes that the columns in Project A and Project B have the same names. Otherwise you would have to use the transform dialog on each column separately and manually map the column names from Project A to Project B.
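The combined effect of steps 2 and 3 can be modeled in Python as follows. This is only a sketch of the idea, not a translation of the GREL expressions: the function name and sample data are hypothetical, and it assumes the "Sync" values of Project A are row indices for missing rows and empty strings otherwise, as described in step 1.

```python
def load_missing_rows(rows_a, sync_a, rows_b):
    """Model of steps 2 and 3: add placeholder rows to B, then fill them from A."""
    # Lookup from a Sync value to the Project A row that carries it.
    a_by_sync = {s: row for s, row in zip(sync_a, rows_a) if s != ""}

    # Existing rows of Project B keep an empty Sync value.
    result = [dict(row, Sync="") for row in rows_b]

    # Step 2: one placeholder row per row index of Project A.
    for index in range(len(rows_a)):
        placeholder = {"Sync": index}
        # Step 3: copy the columns across if A has a row with this Sync value.
        placeholder.update(a_by_sync.get(index, {}))
        result.append(placeholder)
    return result


# Hypothetical data: "x2" is already in Project B, so its Sync value is empty.
rows_a = [
    {"id": "x1", "name": "Ann"},
    {"id": "x2", "name": "Bob"},
    {"id": "x3", "name": "Cy"},
]
sync_a = [0, "", 2]
rows_b = [{"id": "x2", "name": "Bob"}]

result = load_missing_rows(rows_a, sync_a, rows_b)
for row in result:
    print(row)
```

Placeholder rows whose index has no counterpart in Project A (index 1 in this example) stay empty apart from their "Sync" value.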
4. Clean up