{"id":928,"date":"2021-06-14T23:03:37","date_gmt":"2021-06-14T17:33:37","guid":{"rendered":"https:\/\/techieshouts.com\/?p=928"},"modified":"2022-08-09T19:03:53","modified_gmt":"2022-08-09T13:33:53","slug":"spark-execution-modes","status":"publish","type":"post","link":"https:\/\/techieshouts.com\/home\/spark-execution-modes\/","title":{"rendered":"Spark execution modes"},"content":{"rendered":"\n<p>There are different modes in which we can execute a spark program. This is can be done while running the Spark-submit command to Yarn in the Hadoop cluster<\/p>\n\n\n\n<h3>Mode 1 &#8211; Local<\/h3>\n\n\n\n<p>In this mode, both the driver and executor program will run in the same machine. Whatever logs that are added to both the driver and executor will be shown in the same console.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"classic\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">spark-submit --master local[*] \\\n      --conf spark.dynamicAllocation.enabled=true \\\n      --conf spark.dynamicAllocation.minExecutors=1 \\\n      --conf spark.dynamicAllocation.maxExecutors=30 \\\n      --conf spark.dynamicAllocation.initialExecutors=1 \\\n      --jars JARS_PATH \\\n      sparkrunner.py &lt;program arguments><\/pre>\n\n\n\n<p>Here * in local[*] represents the maximum number of cores.<\/p>\n\n\n\n<h3>Mode 2 &#8211; Yarn Client<\/h3>\n\n\n\n<p>In this mode, the driver will be running in the machine where the spark program is triggered. 
The executors run on other machines in the cluster, so their logs appear on those machines (or in the YARN UI) rather than in the local console.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"classic\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">spark-submit --master yarn --deploy-mode client --driver-memory 5G --executor-memory 10G --executor-cores 2 \\\n      --conf spark.dynamicAllocation.enabled=true \\\n      --conf spark.dynamicAllocation.minExecutors=1 \\\n      --conf spark.dynamicAllocation.maxExecutors=30 \\\n      --conf spark.dynamicAllocation.initialExecutors=1 \\\n      --jars JARS_PATH \\\n      sparkrunner.py &lt;program arguments><\/pre>\n\n\n\n<h3>Mode 3 &#8211; Yarn Cluster<\/h3>\n\n\n\n<p>In this mode, we have no control over which machines run the driver and executor processes; submission is fire and forget. YARN starts the driver inside the ApplicationMaster on one of the cluster nodes and schedules the executors on other nodes. After the run, the driver logs can be fetched with yarn logs -applicationId &lt;application id>.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"classic\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">spark-submit --master yarn --deploy-mode cluster --driver-memory 5G --executor-memory 10G --executor-cores 2 \\\n      --conf spark.dynamicAllocation.enabled=true \\\n      --conf spark.dynamicAllocation.minExecutors=1 \\\n      --conf spark.dynamicAllocation.maxExecutors=30 \\\n      --conf spark.dynamicAllocation.initialExecutors=1 \\\n      --jars JARS_PATH \\\n      sparkrunner.py &lt;program arguments><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>There are different modes in which we can execute a Spark program. 
This can be done while running the spark-submit command to YARN in the Hadoop cluster Mode 1 &#8211; Local In this mode, both the driver and executor program will run in the same machine. Whatever logs that are added to both the\u2026 <span class=\"read-more\"><a href=\"https:\/\/techieshouts.com\/home\/spark-execution-modes\/\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[11,136],"tags":[139,138,137],"_links":{"self":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts\/928"}],"collection":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/comments?post=928"}],"version-history":[{"count":5,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts\/928\/revisions"}],"predecessor-version":[{"id":945,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts\/928\/revisions\/945"}],"wp:attachment":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/media?parent=928"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/categories?post=928"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/tags?post=928"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}