distinct: remove duplicates
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext...sparkSession.sparkContext());
List<Integer> data = Arrays.asList(1, 1, 2, 3, 4, 5);
JavaRDD...results);
}
}
The result is [4, 1, 3, 5, 2] (the order of the elements is not guaranteed).
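The result is unsorted because distinct redistributes elements across partitions, so the order of the collected list can differ between runs or cluster configurations. The following is a minimal plain-Java sketch of the same semantics, not Spark code: the class `DistinctSketch` and the method `distinctSorted` are hypothetical names used only for illustration. It sorts the deduplicated values so the output can be compared reliably.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctSketch {
    // Hypothetical helper: drop duplicates, then sort so the
    // result no longer depends on the order elements arrive in.
    static List<Integer> distinctSorted(List<Integer> data) {
        return data.stream()
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 1, 2, 3, 4, 5);
        System.out.println(distinctSorted(data)); // prints [1, 2, 3, 4, 5]
    }
}
```

With a real RDD, the same idea applies: sort the collected list before comparing it with an expected value.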
union: merge without removing duplicates
This simply concatenates two RDDs into one; elements that appear in both are kept twice.
import org.apache.spark.api.java.JavaRDD...one = Arrays.asList(1, 2, 3, 4, 5);
List<Integer> two = Arrays.asList(1, 6, 7, 8, 9);
JavaRDD...results);
}
}
The result is [1, 2, 3, 4, 5, 1, 6, 7, 8, 9]
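Note that the value 1 appears twice in the result, once from each input. As a rough plain-Java analogue of this behavior (not Spark code; `UnionSketch` and `union` are hypothetical names chosen for this illustration), concatenating two lists without deduplication gives the same elements:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UnionSketch {
    // Hypothetical helper: append the second list to the first,
    // keeping every element, including duplicates.
    static List<Integer> union(List<Integer> one, List<Integer> two) {
        return Stream.concat(one.stream(), two.stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> one = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> two = Arrays.asList(1, 6, 7, 8, 9);
        System.out.println(union(one, two)); // prints [1, 2, 3, 4, 5, 1, 6, 7, 8, 9]
    }
}
```

If a set-style union is wanted instead, the duplicates have to be removed in a separate step, for example by applying distinct to the merged result.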
intersection: take the intersection
import org.apache.spark.api.java.JavaRDD