I have been using Spark with Scala for quite a while. Now I am using PySpark for the first time. This is on a Mac.
First I installed pyspark with pip install pyspark, which brought in pyspark 2.2.0.
I installed Spark itself with brew install apache-spark, which appears to have installed apache-spark 2.2.0.
But when I run pyspark, it blows up with:
/Users/me/anaconda/bin/pyspark: line 24: /Users/bruceho/spark-1.6/spark-1.6.2-bin-hadoop2.6/bin/load-spark-env.sh: No such file or directory
/Users/me/
After reinstalling pyspark via pip install pyspark, I get the following error:
> pyspark
Could not find valid SPARK_HOME while searching ['/Users', '/usr/local/bin']
/usr/local/bin/pyspark: line 24: /bin/load-spark-env.sh: No such file or directory
/usr/local/bin/pyspark: line 77: /bin/spark-submit: No such file
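As a sketch of the usual fix for this pip + brew pairing, pointing SPARK_HOME at the brew-installed Spark lets the pip-installed launcher find the scripts it says are missing; the exact Cellar path below is an assumption to adjust for your machine:
# assumption: Homebrew put apache-spark 2.2.0 under its default Cellar path
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.2.0/libexec
export PATH="$SPARK_HOME/bin:$PATH"
# relaunch from a shell where SPARK_HOME is set
pyspark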
I just downloaded Spark 1.2.1, but it fails to build in the assembly project with errors like the following:
The requested profile "hadoop-2.6" could not be activated because it does not exist.
[ERROR] Failed to execute goal on project spark-assembly_2.10: Could not resolve dependencies for project org.apache.spark:spark-assembly_2.10:pom:1.2.1: Failure to find
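For what it is worth, a hedged sketch of the kind of Maven invocation the Spark 1.2.x build docs describe; the hadoop-2.6 profile only appeared in later releases, so the profile names and version override below are assumptions to verify against the 1.2.1 build documentation:
# assumption: build against Hadoop 2.6 by reusing the hadoop-2.4 profile and overriding hadoop.version
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package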
I installed R 3.4.0 and rstudio-server 1.1.447 on CentOS. In RStudio, I cannot connect to Spark like this:
sc <- spark_connect(master = "local")
Error in validate_java_version(master, spark_home) :
Java is required to connect to Spark. Please download and install Java from https://www.java.com/en/
It says Java is required to connect to Spark. But
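A minimal sketch of what this error usually calls for on CentOS; the package name, JAVA_HOME path, and service name are assumptions to adapt:
# assumption: OpenJDK 8 from the CentOS repos is acceptable; the package name may differ on your release
sudo yum install -y java-1.8.0-openjdk
# assumption: this is where that package puts the JRE; check the actual path on your system
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
# restart RStudio Server so R sessions see the new environment
sudo systemctl restart rstudio-server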
I have been following this tutorial to install Spark for Scala:
However, when I try to run spark-shell, I get this error in the console.
/usr/local/spark/bin/spark-shell: line 57: /usr/local/spark/bin/bin/spark-submit: No such file or directory
My .bashrc looks like this:
export PATH = $PATH:/usr/local/spark/bin
export SCALA_HOME=/usr/local/scala/bin
export PYTHONPATH=$SPARK_HOME/pyth
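The doubled bin/bin in the error usually means the Spark home the script resolves ends in /bin. A sketch of the exports I would expect instead, assuming Spark is unpacked at /usr/local/spark and Scala at /usr/local/scala (both paths are assumptions):
# point SPARK_HOME at the Spark root, not its bin directory; also note that bash
# does not allow spaces around '=' in an export
export SPARK_HOME=/usr/local/spark
export SCALA_HOME=/usr/local/scala
export PATH=$PATH:$SPARK_HOME/bin:$SCALA_HOME/bin
export PYTHONPATH=$SPARK_HOME/python:$PYTHONPATH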
After installing Spark, I tried to run PySpark from the installation folder:
opt/spark/bin/pyspark
But I get the following errors:
opt/spark/bin/pyspark: line 24: /opt/spark/bin/load-spark-env.sh: No such file or directory
opt/spark/bin/pyspark: line 68: /opt/spark/bin/spark-submit: No such file or directory
opt/spark/bin/pyspark: line 68: exec: /opt/spark/bin/spa
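Since both missing files live under /opt/spark/bin, a quick hedged check is whether the archive was fully extracted there at all; the tarball name and paths below are placeholders:
# see whether the launcher scripts actually exist where the error expects them
ls /opt/spark/bin
# if load-spark-env.sh and spark-submit are missing, re-extract the downloaded release
# (assumption: substitute the tarball name you actually downloaded)
sudo tar -xzf spark-2.x.y-bin-hadoop2.7.tgz -C /opt
sudo ln -s /opt/spark-2.x.y-bin-hadoop2.7 /opt/spark
export SPARK_HOME=/opt/spark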
I have installed Hadoop on Ubuntu in VirtualBox (host OS Windows 7). I have also installed Apache Spark, configured SPARK_HOME in .bashrc, and added HADOOP_CONF_DIR to spark-env.sh. Now when I start spark-shell, it throws an error and does not initialize the Spark context or SQL context. Am I missing something in the installation? I also want to run it on the cluster (a 3-node Hadoop cluster is set up).
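A sketch of the two settings that paragraph describes, with placeholder paths (both are assumptions to replace with the real install locations):
# in ~/.bashrc: where the Spark distribution is unpacked
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin
# in $SPARK_HOME/conf/spark-env.sh: where the Hadoop client configs live,
# so spark-shell --master yarn can reach the 3-node cluster
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop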
I have installed Spark 1.6.2 and Spark 2.0 on my Hortonworks cluster.
Both versions are installed on one node of the 5-node Hadoop cluster.
Every time I start spark-shell, I get:
$ spark-shell
Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
When I check the version, I get:
scala> sc.version
res0: String = 1.6.2
How do I start the spark-shell for the other version (Spark 2.0)?
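Following the hint in the warning itself, a minimal sketch for picking the Spark 2 client on a Hortonworks node:
# the startup warning says Spark1 is the default unless SPARK_MAJOR_VERSION is set
export SPARK_MAJOR_VERSION=2
spark-shell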
I am trying to install Apache Spark and Hadoop on the same machine. Spark will be used to process data, and Hadoop's HDFS will be used to store it. I installed Spark first and it worked fine. But when I started installing Hadoop and set the JAVA_HOME environment variable, HDFS worked, while Spark crashed on startup with: Files was unexpected at this time. When I remove JAVA_HOME, Spark works again, but HDFS does not. What should I do in this situation?
There seem to be two ways to install Spark.
When installing Spark by downloading a pre-built release (e.g. spark-2.4.5-bin-hadoop2.7.tgz):
- Do I need to additionally install the `java` command, by installing a JRE?
- Do I need to additionally install the Java compiler `javac`, by installing a JDK?
- Do I need to additionally install scala compiler? (I guess no, because I sa
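As a quick sanity check rather than an answer, here is a hedged sketch of what the pre-built tarball needs on the machine; the file name below is just the release quoted above:
# the pre-built distribution needs a Java runtime on PATH; this verifies one is present
java -version
# unpack the downloaded release and run its bundled launcher; the tarball already ships
# the Scala runtime it needs, so no separate scala or javac install is assumed here
tar -xzf spark-2.4.5-bin-hadoop2.7.tgz
spark-2.4.5-bin-hadoop2.7/bin/spark-submit --version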
I have set up a Spark cluster on HDInsight and am trying to use GraphFrames.
I enabled GraphFrames on the Spark cluster with a custom script during cluster creation, as described here.
When I run this in the notebook:
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
import org.graphframes._
I get the following error:
<console>:45: error: object graphframes is not a member of package org
import org.graphframes._
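When launching spark-shell by hand, the package has to be put on the classpath explicitly; a sketch, where the exact graphframes coordinate and version are assumptions to match the cluster's Spark and Scala versions (a notebook kernel would need the equivalent spark.jars.packages setting instead):
# assumption: this graphframes version matches the cluster's Spark/Scala versions
spark-shell --packages graphframes:graphframes:0.5.0-spark2.1-s_2.11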
I have just installed Anaconda, Apache Spark, PySpark, and Scala on a fresh Linux Mint installation (all latest versions).
To test the installation, I tried to run spark-submit in the terminal, but I get the following error:
File "/home/jessica/anaconda/bin/find_spark_home.py", line 74, in <module>
print(_find_spark_home())
File "/home/jessica/anaconda/bin/find_spark_home.py", line 56, in
I am looking for the simplest advice to correct my Spark installation and setup so that I can correctly run the following in a Jupyter notebook:
from pyspark import SparkContext
sc = SparkContext()
In the Jupyter notebook I get the following file-not-found error, which refers to the directory where I previously installed spark-2.0.0-bin-hadoop2.7:
FileNotFoundError: [Errno 2] No such file or directory: '/Applications/spark-2.0.0-bin-hadoop2.7/./bin/spark-s
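A sketch of the environment I would set before launching the notebook, assuming Spark still lives at the path the error mentions (if it was moved, point SPARK_HOME at the new location instead):
# assumption: spark-2.0.0-bin-hadoop2.7 is still unpacked under /Applications
export SPARK_HOME=/Applications/spark-2.0.0-bin-hadoop2.7
export PATH="$SPARK_HOME/bin:$PATH"
# start Jupyter from this shell so the notebook kernel inherits SPARK_HOME
jupyter notebook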