I have two Datasets:
implicit val spark: SparkSession = SparkSession
  .builder()
  .appName("app").master("local[1]")
  .config("spark.executor.memory", "1g")
  .getOrCreate()
import spark.implicits._
val ds1 = /*read csv file*/.as[caseClass1]
val ds2 = /*read csv file*/.as[caseClass2]
Then I join them and map over the result:
val ds3 = ds1
  .joinWith(ds2, ds1("id") === ds2("id"))
  .map { case (left, right) => (left, Option(right)) }
This gives the expected result.
The problem is that I am trying to build a RichDataset that wraps this function and a few others, like this:
import org.apache.spark.sql.{Column, Dataset, SparkSession}

object Extentions {
  implicit class RichDataset[T <: Product](leftDs: Dataset[T]) {
    def leftJoinWith[V <: Product](rightDs: Dataset[V], condition: Column)
                                  (implicit spark: SparkSession): Dataset[(T, Option[V])] = {
      import spark.implicits._
      leftDs.joinWith(rightDs, condition, "left")
        .map { case (left, right) => (left, Option(right)) }
    }
  }
}
In main, after importing Extentions._, the call to leftJoinWith fails to compile:
Error:(15, 13) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
.map{case(left, right) => (left, Option(right))}
Error:(15, 13) not enough arguments for method map: (implicit evidence$6: org.apache.spark.sql.Encoder[(T, Option[V])])org.apache.spark.sql.Dataset[(T, Option[V])]. Unspecified value parameter evidence$6.
.map{case(left, right) => (left, Option(right))}
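... but spark.implicits._ is imported inside the function!
If the function returns only the join, rather than join + map, it works both in main and inside the function.
scalaVersion := "2.11.8", sparkVersion := "2.2.0"
Thanks in advance!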
Posted on 2017-11-05 13:43:54
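It works if you add a TypeTag context bound to the generic type parameters (Spark's own source code does the same). The product encoder in spark.implicits._ is declared as newProductEncoder[T <: Product : TypeTag], so without a TypeTag for T and V the compiler cannot derive an Encoder[(T, Option[V])] inside the generic method, even though the import is in scope: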
import scala.reflect.runtime.universe.TypeTag
import org.apache.spark.sql.{Column, Dataset, SparkSession}

object Extentions {
  implicit class RichDataset[T <: Product : TypeTag](leftDs: Dataset[T]) {
    def leftJoinWith[V <: Product : TypeTag](rightDs: Dataset[V], condition: Column)
                                            (implicit spark: SparkSession): Dataset[(T, Option[V])] = {
      import spark.implicits._
      leftDs.joinWith(rightDs, condition, "left")
        .map { case (left, right) => (left, Option(right)) }
    }
  }
}
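For completeness, here is a minimal sketch of how the call site might look with the TypeTag version. The case classes User and Order and the sample rows are hypothetical and stand in for caseClass1 and caseClass2 from the question; the Extentions object is repeated only so that the example compiles on its own. It assumes Spark 2.2.x on Scala 2.11, as in the question.

```scala
import scala.reflect.runtime.universe.TypeTag
import org.apache.spark.sql.{Column, Dataset, SparkSession}

// Hypothetical case classes, for illustration only. They must be defined
// at the top level (not inside a method) so that a TypeTag is available.
case class User(id: Int, name: String)
case class Order(id: Int, amount: Double)

object Extentions {
  implicit class RichDataset[T <: Product : TypeTag](leftDs: Dataset[T]) {
    def leftJoinWith[V <: Product : TypeTag](rightDs: Dataset[V], condition: Column)
                                            (implicit spark: SparkSession): Dataset[(T, Option[V])] = {
      import spark.implicits._
      // Option(null) is None, so unmatched left rows end up paired with None.
      leftDs.joinWith(rightDs, condition, "left")
        .map { case (left, right) => (left, Option(right)) }
    }
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    implicit val spark: SparkSession = SparkSession
      .builder()
      .appName("app").master("local[1]")
      .getOrCreate()
    import spark.implicits._
    import Extentions._

    // Made-up sample data instead of the CSV files from the question.
    val users: Dataset[User] = Seq(User(1, "a"), User(2, "b")).toDS()
    val orders: Dataset[Order] = Seq(Order(1, 9.5)).toDS()

    // T and V are concrete here, so the compiler can supply both TypeTags.
    val joined: Dataset[(User, Option[Order])] =
      users.leftJoinWith(orders, users("id") === orders("id"))

    joined.collect().foreach(println)
    spark.stop()
  }
}
```

The context bounds simply move the TypeTag requirement to the call site, where T and V are known concrete case classes, so nothing extra has to be passed by hand.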
https://stackoverflow.com/questions/47120041