I have an existing AWS Glue crawler with a Glue connector to a MySQL database, and it runs successfully. I need to move it to Glue 3.0 so that it uses an updated MySQL JDBC driver (Glue 2.0 jobs use MySQL JDBC driver version 5.1, but AWS Glue 3.0 uses MySQL JDBC driver 8.0.23). The crawler is created/updated with boto3's glue_client.update_crawler. The crawler is set to use a JDBC Glue connector that is also created with boto3, and that call does not have a glue_version parameter either.
The documentation on boto3's Glue client crawler functions, https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Client.update_crawler, does not include an option for GlueVersion. I don't see any relevant options in the console either. The crawler configuration has a version, but I don't think it's the Glue version, and it errors out when I set it to 3.0. Is there perhaps a default Glue version setting somewhere that crawlers use?
Currently I am using:
glue_client = boto3.client('glue', region_name=region)
configuration = {"Version": 1.0, "Grouping": {"TableGroupingPolicy": "CombineCompatibleSchemas"}}
response = glue_client.update_crawler(
    Name=crawler_name,
    Role=glue_role_arn,
    DatabaseName=str(crawler_details['DatabaseName']) + '-' + str(env_suffix),
    Description=crawler_details['description'],
    Targets=targets,
    TablePrefix=crawler_details['TablePrefix'],
    Schedule=crawler_details['Schedule'],
    SchemaChangePolicy=crawler_details['SchemaChangePolicy'],
    Configuration=configuration
)
How do I set a Glue crawler to use GlueVersion = 3.0 using boto3?
A Glue crawler does not have a Glue version; only Glue jobs do. Instead, select the correct connection in the crawler's Targets property, so that you are able to connect to a newer version.
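As a rough sketch of what selecting the connection in Targets could look like with boto3: the JDBC target entry carries a ConnectionName and a Path, and update_crawler only requires Name, so the target can be changed on its own. The helper names, the crawler name, the connection name, and the include path below are all hypothetical placeholders, not values from the question.

```python
def build_jdbc_targets(connection_name, include_path):
    """Build the Targets argument for a crawler with a single JDBC target.

    connection_name is the name of an existing Glue connection;
    include_path is the JDBC include path (for MySQL, e.g. "database/%").
    Both values are placeholders supplied by the caller.
    """
    return {
        "JdbcTargets": [
            {
                "ConnectionName": connection_name,
                "Path": include_path,
            }
        ]
    }


def point_crawler_at_connection(glue_client, crawler_name, connection_name, include_path):
    """Update an existing crawler so its JDBC target uses the given connection.

    glue_client is expected to be a boto3 Glue client (boto3.client("glue")).
    Only Name and Targets are sent, so other crawler settings are left as they are.
    """
    targets = build_jdbc_targets(connection_name, include_path)
    return glue_client.update_crawler(Name=crawler_name, Targets=targets)


# Example usage (needs AWS credentials, so it is left commented out;
# all names here are made up for illustration):
# import boto3
# glue_client = boto3.client("glue", region_name="us-east-1")
# point_crawler_at_connection(
#     glue_client, "my-crawler", "my-mysql-connection", "mydb/%"
# )
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.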