{"id":934,"date":"2021-06-15T23:49:25","date_gmt":"2021-06-15T18:19:25","guid":{"rendered":"https:\/\/techieshouts.com\/?p=934"},"modified":"2022-08-09T19:03:48","modified_gmt":"2022-08-09T13:33:48","slug":"spark-reading-from-hive-table","status":"publish","type":"post","link":"https:\/\/techieshouts.com\/home\/spark-reading-from-hive-table\/","title":{"rendered":"Spark reading from Hive table"},"content":{"rendered":"\n<p>In this post, we will see how to read data from a Hive table using Spark. Because Spark computes in memory, it can often process data much faster than a classic MapReduce program.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from pyspark.sql import SparkSession\n\nappname = \"Application name\"\n# enableHiveSupport() lets the session query tables registered in the Hive metastore\nspark = SparkSession.builder.appName(appname).enableHiveSupport().getOrCreate()\n\n# Replace database.tablename with the Hive table to count\ncountsql = \"select count(*) as tot_cnt from database.tablename\"\nprint(\"Count SQL\")\nprint(\"---------\")\nprint(countsql)\n\n# first() returns the single result Row; index 0 is the tot_cnt column\nrec_cnt = spark.sql(countsql).first()[0]\nprint(rec_cnt)<\/pre>\n\n\n\n<p>The program above counts the rows in a Hive table and prints the result.<\/p>\n\n\n\n<p>Also read<\/p>\n\n\n\n<ol><li><a rel=\"noreferrer noopener\" aria-label=\"Spark execution modes (opens in a new tab)\" href=\"https:\/\/techieshouts.com\/spark-execution-modes\/\" target=\"_blank\">Spark execution modes<\/a><\/li><li><a href=\"https:\/\/techieshouts.com\/spark-reading-from-oracle\">Spark reading from Oracle<\/a><\/li><\/ol>\n","protected":false},"excerpt":{"rendered":"<p>In this post, we will see how to read data from a Hive table using Spark. 
Because Spark computes in memory, it can often process data much faster than a classic MapReduce program. The program above counts the rows in a Hive table and prints the result. Also read\u2026 <span class=\"read-more\"><a href=\"https:\/\/techieshouts.com\/home\/spark-reading-from-hive-table\/\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[136,11],"tags":[142,141,140],"_links":{"self":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts\/934"}],"collection":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/comments?post=934"}],"version-history":[{"count":3,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts\/934\/revisions"}],"predecessor-version":[{"id":948,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/posts\/934\/revisions\/948"}],"wp:attachment":[{"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/media?parent=934"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/categories?post=934"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techieshouts.com\/home\/wp-json\/wp\/v2\/tags?post=934"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}