How to access elements within a structured JSON-LD or microdata list?


I am trying to access items within a json-ld list/dictionary via Python.

If possible, I would like to know whether there is a product and, if so, what the URL, price and availability of this product are.

The info is right there in the metadata. How can this be accessed?

data = {'json-ld': [{'@context': 'http://schema.org',
          '@type': 'WebSite',
          'potentialAction': {'@type': 'SearchAction',
                              'query-input': 'required '
                                             'name=search_term_string',
                              'target': 'https://www.vitenda.de/search/result?term={search_term_string}'},
          'url': 'https://www.vitenda.de'},
         {'@context': 'http://schema.org',
          '@type': 'Organization',
          'logo': 'https://www.vitenda.de/documents/logo/logo_vitenda_02_646.png',
          'url': 'https://www.vitenda.de'},
         {'@context': 'http://schema.org/',
          '@type': 'BreadcrumbList',
          'itemListElement': [{'@type': 'ListItem',
                               'item': {'@id': 'https://www.vitenda.de/search',
                                        'name': 'Artikelsuche'},
                               'position': 1},
                              {'@type': 'ListItem',
                               'item': {'@id': '',
                                        'name': 'Ihre Suchergebnisse für '
                                                "<b>'11287708'</b> (1 "
                                                'Produkte)'},
                               'position': 2}]},
         {'@context': 'http://schema.org/',
          '@type': 'Product',
          'brand': {'@type': 'Organization', 'name': 'ALIUD Pharma GmbH'},
          'description': '',
          'gtin': '',
          'image': 'https://cdn1.apopixx.de/300/web_schraeg_png/11287708.png?ver=1649058520',
          'itemCondition': 'https://schema.org/NewCondition',
          'name': 'GINKGO AL 240 mg Filmtabletten',
          'offers': {'@type': 'Offer',
                     'availability': 'http://schema.org/InStock',
                     'deliveryLeadTime': {'@type': 'QuantitativeValue',
                                          'minValue': '3'},
                     'price': 96.36,
                     'priceCurrency': 'EUR',
                     'priceValidUntil': '19-06-2022 18:41:54',
                     'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'},
          'productID': '11287708',
          'sku': '11287708',
          'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'}]}



if '@context' in data['json-ld'][0]['@context']:
    print('yes')
else:
    print('no')

print(data['json-ld'][3])

There is 1 solution below

asdf On

It seems that products have an @type key with the value Product, so we can select the dictionaries that match and iterate over them to get the name, price, availability and URL:

# Keep only the entries whose @type is Product
products = [d for d in data['json-ld'] if d.get('@type') == 'Product']

if products:
    print(f'Found {len(products)} product{"s" if len(products) != 1 else ""}:')
else:
    print('No products found')

for product in products:
    name = product['name']
    offers = product.get('offers', {})
    # availability is a schema.org URL such as http://schema.org/InStock
    available = 'InStock' in offers.get('availability', '')
    price = f'{offers["price"]:.2f} {offers["priceCurrency"]}' if available else 'not available'
    url = product['url']
    print(f'{name} ({price}), {url}')

Output:

Found 1 product:
GINKGO AL 240 mg Filmtabletten (96.36 EUR), https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708
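
The snippet above assumes that offers is a single dictionary and that @type is a plain string, which holds for the data in the question. JSON-LD in general also allows both to be lists, so pages from other shops may need a slightly more defensive version. The following is only a sketch under that assumption: the helper names as_list and extract_products and the trimmed-down sample are made up for illustration and do not come from any library.

```python
def as_list(value):
    """Hypothetical helper: wrap a single value in a list, pass lists through."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def extract_products(data):
    """Hypothetical helper: return one dict per Product entry under data['json-ld']."""
    results = []
    for entry in data.get('json-ld', []):
        # '@type' may be a string or a list of strings
        if 'Product' not in as_list(entry.get('@type')):
            continue
        # 'offers' may be a single Offer dict or a list of them; use the first one
        offers = [o for o in as_list(entry.get('offers')) if isinstance(o, dict)]
        offer = offers[0] if offers else {}
        availability = offer.get('availability') or ''
        results.append({
            'name': entry.get('name'),
            # Fall back to the offer URL if the product itself has none
            'url': entry.get('url') or offer.get('url'),
            'price': offer.get('price'),
            'currency': offer.get('priceCurrency'),
            # Compare only the last path segment, e.g. 'InStock'
            'in_stock': availability.rstrip('/').rsplit('/', 1)[-1] == 'InStock',
        })
    return results


# Trimmed-down sample in the same shape as the data in the question
sample = {'json-ld': [
    {'@type': 'Organization', 'url': 'https://www.vitenda.de'},
    {'@type': 'Product',
     'name': 'GINKGO AL 240 mg Filmtabletten',
     'offers': {'@type': 'Offer',
                'availability': 'http://schema.org/InStock',
                'price': 96.36,
                'priceCurrency': 'EUR'},
     'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'},
]}

for item in extract_products(sample):
    print(item)
```

An empty result list then answers the "is there a product" part of the question, and each returned dictionary carries the URL, price and availability without raising a KeyError when a field is missing.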