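I have an AWS Glue job that reads a CSV file from an S3 bucket and imports the data into a Postgres table. It connects to the database over a JDBC connection. The string/varchar columns are imported, but the numeric columns are not.
Here are the Postgres RDS column types:
Here is the Python Glue script: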
    def __step_mapping_columns(self):
        # Script generated for node S3 bucket
        dynamicFrame_dept_summary = self.glueContext.create_dynamic_frame.from_options(
            format_options={"quoteChar": '"', "withHeader": True, "separator": ","},
            connection_type="s3",
            format="csv",
            connection_options={
                "paths": [
                    ""
                ],
                "recurse": True,
            },
            transformation_ctx="dynamicFrame_dept_summary",
        )
        # Script generated for node ApplyMapping
        applyMapping_dept_summary = ApplyMapping.apply(
            frame=dynamicFrame_dept_summary,
            mappings=[("PROCESS_MAIN", "string", "process_main", "string"),
                      ("PROCESS_CORE", "string", "process_core", "string"),
                      ("DC", "string", "dc", "string"),
                      ("BAG_SIZE", "string", "bag_size", "string"),
                      ("EVENT_30_LOC", "string", "start_time_utc", "string"),
                      ("VOLUME", "long", "box_volume", "long"),
                      ("MINUTES", "long", "minutes", "long"),
                      ("PLAN_MINUTES", "long", "plan_minutes", "long"),
                      ("PLAN_RATE", "long", "plan_rate", "long")],
            transformation_ctx="applyMapping_dept_summary",
        )
        logger.info(mappings)
        return applyMapping_dept_summary
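Does anyone know what the problem might be?
Posted on 2022-12-02 13:49:10
Figured it out. I needed to cast these columns to long first, because the DynamicFrame could not determine their data type on its own.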
    # Cast each numeric column to long before ApplyMapping runs.
    dynamicFrame_dept_summary = (
        dynamicFrame_dept_summary
        .resolveChoice(specs=[("VOLUME", "cast:long")])
        .resolveChoice(specs=[("MINUTES", "cast:long")])
        .resolveChoice(specs=[("PLAN_MINUTES", "cast:long")])
        .resolveChoice(specs=[("PLAN_RATE", "cast:long")])
    )
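To avoid listing the cast columns by hand, the specs could be derived from the same mapping list that ApplyMapping uses. The following is only a sketch in plain Python: `build_cast_specs` and `MAPPINGS` are hypothetical names that do not appear in the original script, and the actual `resolveChoice` call is shown only as a comment because it needs a running Glue job.

```python
from typing import List, Tuple

# (source_name, source_type, target_name, target_type), as passed to ApplyMapping
Mapping = Tuple[str, str, str, str]


def build_cast_specs(mappings: List[Mapping], target_type: str = "long") -> List[Tuple[str, str]]:
    """Return resolveChoice-style specs for every source column mapped to target_type."""
    return [
        (source_name, f"cast:{target_type}")
        for source_name, _source_type, _target_name, mapped_type in mappings
        if mapped_type == target_type
    ]


# Hypothetical module-level copy of the mapping list from the question.
MAPPINGS: List[Mapping] = [
    ("PROCESS_MAIN", "string", "process_main", "string"),
    ("PROCESS_CORE", "string", "process_core", "string"),
    ("DC", "string", "dc", "string"),
    ("BAG_SIZE", "string", "bag_size", "string"),
    ("EVENT_30_LOC", "string", "start_time_utc", "string"),
    ("VOLUME", "long", "box_volume", "long"),
    ("MINUTES", "long", "minutes", "long"),
    ("PLAN_MINUTES", "long", "plan_minutes", "long"),
    ("PLAN_RATE", "long", "plan_rate", "long"),
]

specs = build_cast_specs(MAPPINGS)
print(specs)

# Inside the Glue job, the specs would be applied in one call before ApplyMapping:
# dynamicFrame_dept_summary = dynamicFrame_dept_summary.resolveChoice(specs=specs)
```

Since `resolveChoice` accepts a list of specs, a single call with all four entries should be equivalent to the chained calls, and the cast list stays in sync with the mappings if columns are added later.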
https://stackoverflow.com/questions/74659315