我正在尝试使用Spark/Python收集Azure Eventhub消息。每次,我都会得到异常"StreamingQueryException:输入字节数组有错误的4字节结束单元“
有什么想法吗?
conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://XXXXXXXXX.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXX=;EntityPath=XXXXXX"
read_df = spark.readStream.format("eventhubs").options(**conf).load()
stream = read_df.writeStream.format("console").start()
stream.awaitTermination()
发布于 2020-10-26 22:16:31
需要注意的是,2.3.15以上版本需要对配置字典中的连接字符串进行加密:
ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
https://stackoverflow.com/questions/64491504
复制