How to delete HTML tags using Python


I am looking for a way to delete the records held in the 'content' field of my ADO wiki page.

I've checked the JSON content returned for my page URL:

import base64
import json

import pyautogui
import requests

# kv_url, pat_secret_name, wiki_url and get_or_set_secret() are not shown here.
def wiki_update(vm_list):
    pyautogui.hotkey('f5')
    # Build the Basic auth header from the personal access token (PAT).
    pat = get_or_set_secret(kv_url, pat_secret_name, " ")
    authorization = base64.b64encode(f":{pat}".encode('ascii')).decode('ascii')
    basic_auth_value = f"Basic {authorization}"
    get_headers = {"Authorization": basic_auth_value}

    get_response = requests.get(wiki_url, verify=True, headers=get_headers)
    content = json.loads(get_response.text)
    print(content)

The output is:

{'path': '/MyWikiPage', 'order': 1, 'gitItemPath': '/Test%2DDebugVM%2DRemove%2DContent.md', 'subPages': [], 'url': 'https://dev.azure.com/xxx/123456789a6ba0154ffce/_apis/wiki/wikis/12345981-0649bdzsbf7aad/pages/%2FTest-DebugVM-Remove-Content', 'remoteUrl': 'https://dev.azure.com/xxx/123456789a6ba0154ffce/_apis/wiki/wikis/12345981-0649bdzsbf7aad?pagePath=%2FMyWikiPage', 'id': 14523, 'content': '\n|virtual machine name | ADO ticket | owner Email | creation date |\n|--|--|--|--|\n| my-12345-vm-0 | #12345 | [email protected] | 29-01-2024 |'}

The specific key whose items I want to remove is 'content':

'content': '\n|virtual machine name | ADO ticket | owner Email | creation date |\n|--|--|--|--|\n| my-12345-vm-0 | #12345 | [email protected] | 29-01-2024 |'

I do want to keep the headers, so the items I need help removing are:

my-12345-vm-0 | #12345 | [email protected] | 29-01-2024 |
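Since 'content' is a Markdown table stored as a plain string, one possible approach is to treat it as text rather than HTML and keep only the first two table lines (the header and the `|--|` separator). The sketch below assumes the page holds a single table and that every table line starts with `|`; the function name `keep_table_header` is made up for illustration.

```python
def keep_table_header(markdown: str) -> str:
    """Return the Markdown text with every table data row removed.

    Assumption: the text contains one table whose lines start with '|'.
    The first two such lines (header and separator) are kept.
    """
    kept = []
    table_lines_seen = 0
    for line in markdown.splitlines():
        if line.lstrip().startswith("|"):
            table_lines_seen += 1
            if table_lines_seen > 2:
                continue  # data row: drop it
        kept.append(line)
    return "\n".join(kept)


# The 'content' value exactly as it appears in the JSON output above.
page_content = (
    "\n|virtual machine name | ADO ticket | owner Email | creation date |"
    "\n|--|--|--|--|"
    "\n| my-12345-vm-0 | #12345 | [email protected] | 29-01-2024 |"
)
trimmed = keep_table_header(page_content)
print(trimmed)
```

Counting table lines instead of matching specific VM names means the same call should clear any number of rows, while lines that are not part of the table pass through unchanged.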

Here is the HTML output from the wiki page:

<script id="dataProviders" type="application/json">{"data":{".... ,"content":"|virtual machine name | ADO ticket | owner Email | creation date |\n|--|--|--|--|\n| my-12345-vm-0 | #12345 | [email protected] | 29-01-2024 |"},
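The HTML above is only the rendered page, so changing it would probably not alter the stored wiki content. A likely alternative is to send the new 'content' string back through the REST API. As far as I know, the Azure DevOps wiki "Pages - Create Or Update" call is a PUT with a JSON body of the form {"content": ...} and an If-Match header carrying the ETag returned by the earlier GET. The sketch below only assembles the request so it can be inspected before anything is sent; `build_wiki_update` and its parameters are hypothetical names, and the exact URL (including its api-version query parameter) is an assumption to check against the documentation.

```python
import base64
import json


def build_wiki_update(page_url: str, pat: str, etag: str, new_content: str) -> dict:
    """Assemble keyword arguments for requests.put(); nothing is sent here.

    Assumptions: page_url already points at the wiki page endpoint, and
    etag is the ETag header value from a previous GET of the same page.
    """
    token = base64.b64encode(f":{pat}".encode("ascii")).decode("ascii")
    return {
        "url": page_url,
        "headers": {
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            "If-Match": etag,  # page version the edit is based on
        },
        "data": json.dumps({"content": new_content}),
    }


# Possible usage, left as comments because it needs network access:
#   get_response = requests.get(wiki_url, headers=get_headers)
#   etag = get_response.headers["ETag"]
#   put_response = requests.put(**build_wiki_update(wiki_url, pat, etag, new_content))
#   put_response.raise_for_status()
```

Separating request construction from sending makes it possible to print the payload and confirm that only the data rows are gone before the page is overwritten.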