Spark local mode vs. standalone cluster in terms of core and thread usage

Posted on 2025-02-02 19:33:06

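I'm comparing PySpark local mode and standalone mode, set up as follows.

Local:

import findspark
# Raw string keeps the backslashes in the Windows path literal
findspark.init(r'C:\spark\spark-3.0.3-bin-hadoop2.7')
# pyspark is importable only after findspark.init() has run
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = SparkConf()
conf.setMaster("local[*]")
conf.setAppName('firstapp')

sc = SparkContext(conf=conf)
spark = SparkSession(sc)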

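Standalone:

import findspark
# Raw string keeps the backslashes in the Windows path literal
findspark.init(r'C:\spark\spark-3.0.3-bin-hadoop2.7')
# pyspark is importable only after findspark.init() has run
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = SparkConf()
conf.setMaster("spark://127.0.0.2:7077")
conf.setAppName('firstapp')

sc = SparkContext(conf=conf)
spark = SparkSession(sc)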
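
For reference, the two master strings above select different execution backends. Per the Spark documentation, `local` runs with one worker thread, `local[N]` with N threads, and `local[*]` with as many threads as the machine has logical cores, all inside the driver process; a `spark://host:port` URL hands scheduling to a standalone master instead. The sketch below is plain Python and not part of PySpark: `describe_master` is a hypothetical helper, it ignores other forms such as `local[N,F]`, and it uses `os.cpu_count()` as a stand-in for the core count the JVM would report.

```python
import os
import re


def describe_master(master: str) -> dict:
    """Classify a Spark master string (hypothetical helper, not a Spark API)."""
    # Plain "local" means a single worker thread in the driver process.
    if master == "local":
        return {"mode": "local", "threads": 1}

    # "local[N]" uses N threads; "local[*]" uses one thread per logical core.
    match = re.fullmatch(r"local\[(\*|\d+)\]", master)
    if match:
        value = match.group(1)
        # Assumption: os.cpu_count() approximates what the JVM would see.
        threads = (os.cpu_count() or 1) if value == "*" else int(value)
        return {"mode": "local", "threads": threads}

    # "spark://host:port" points at a standalone master; the thread count
    # is then decided by the workers, not by this string.
    match = re.fullmatch(r"spark://([^:/]+):(\d+)", master)
    if match:
        return {
            "mode": "standalone",
            "host": match.group(1),
            "port": int(match.group(2)),
        }

    raise ValueError(f"unsupported master string in this sketch: {master!r}")


print(describe_master("local[*]"))
print(describe_master("spark://127.0.0.2:7077"))
```

The point of the sketch is that in local mode the parallelism is fixed by the master string itself, while in standalone mode the string only names the master to contact.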

In addition, I start the master and the workers with:

Master:
bin\spark-class2.cmd org.apache.spark.deploy.master.Master

Worker (run once per worker, depending on how many workers are wanted):
bin\spark-class2.cmd org.apache.spark.deploy.worker.Worker -c 1 -m 1G spark://127.0.0.1:7077

Here `-c 1` gives the worker one core and `-m 1G` gives it 1 GB of RAM.
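
As a rough way to reason about the standalone side: each worker started with `-c 1` offers one core to the master, and an executor runs one concurrent task for every `spark.task.cpus` cores it holds (the default is 1). The estimate below assumes one executor per worker that takes all of that worker's cores, and that the application is granted every offered core, which is the default when `spark.cores.max` is not set. `standalone_task_slots` is a hypothetical helper for illustration, not a Spark API.

```python
def standalone_task_slots(worker_cores, task_cpus=1):
    """Estimate how many tasks can run at once on a standalone cluster.

    worker_cores: cores offered by each worker (the value passed to -c).
    task_cpus:    cores reserved per task (mirrors spark.task.cpus).
    Assumes one executor per worker that holds all of the worker's cores.
    """
    if task_cpus < 1:
        raise ValueError("task_cpus must be at least 1")
    # Each executor can only run whole tasks, so divide per worker.
    return sum(cores // task_cpus for cores in worker_cores)


# Three workers, each started with "-c 1"
print(standalone_task_slots([1, 1, 1]))
```

Under these assumptions, three single-core workers give about the same task parallelism as `local[3]`, but the tasks run in three separate executor processes rather than as threads inside the driver.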

My question is: what is the difference between local mode and standalone mode in terms of how threads and cores are used?
