In the other blog, we saw how to read a Hive table in Spark. In this blog, we will see how to read data from Oracle. This will load the data from the Oracle table into a data frame. After that, we can perform any operation the program needs. We need to pass… Read More »
In this post, we will see how to read data from a Hive table using Spark. With its in-memory computation, Spark can process data much faster than a classical MapReduce program. The above program will get the count of a Hive table and print it. Also read… Read More »
There are different modes in which we can execute a Spark program. The mode can be set when running the spark-submit command on YARN in the Hadoop cluster. Mode 1 – Local: In this mode, both the driver and executor programs run on the same machine. Whatever logs that are added to both the… Read More »
In this post, we will see how to establish a passwordless SSH connection between two Linux machines. This can be done by using a public and private key pair. Once the keys are set up, authentication is done with these keys instead of a password. Let us consider two Linux machines/servers, ServerA… Read More »
In this blog, we will see how to archive or delete a file in HDFS once it is more than n days old. The same approach works for any number of days. For example, let us say that we need to monitor an HDFS folder and delete files once they are more than 7 days old.
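The age check described above can be sketched in Python. This is a minimal sketch, not the approach from the post itself: it assumes the usual eight-column output of `hdfs dfs -ls` (permissions, replication, owner, group, size, date, time, path), and the function name `files_older_than` and the sample paths are hypothetical.

```python
from datetime import datetime, timedelta


def files_older_than(listing_lines, days, now):
    """Return paths of files last modified more than `days` days before `now`.

    `listing_lines` is assumed to be the text output of `hdfs dfs -ls`,
    one entry per line.
    """
    cutoff = now - timedelta(days=days)
    old_paths = []
    for line in listing_lines:
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        # Skip the "Found N items" header and directory entries.
        if len(fields) < 8 or fields[0].startswith("d"):
            continue
        modified = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M")
        if modified < cutoff:
            old_paths.append(fields[7])
    return old_paths


# Hypothetical listing, used only to illustrate the expected input format.
sample = [
    "Found 3 items",
    "-rw-r--r--   3 etl hadoop   1024 2024-01-01 10:30 /data/in/old.csv",
    "-rw-r--r--   3 etl hadoop   2048 2024-01-09 08:00 /data/in/new.csv",
    "drwxr-xr-x   - etl hadoop      0 2023-12-01 00:00 /data/in/archive",
]
old_files = files_older_than(sample, 7, datetime(2024, 1, 10, 12, 0))
print(old_files)
```

Each returned path could then be passed to `hdfs dfs -rm` to delete it, or to `hdfs dfs -mv` to move it into an archive folder.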
In Python, we can merge two dictionary objects very easily. First, we create two dictionaries; then we can merge them and sort the result by value. In Python 2.x, we can use the following code. Python 2.x: In the example below, the two dictionaries are created and then the dictionary keys are merged with the… Read More »
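For comparison, here is a minimal Python 3 sketch of merging two dictionaries and sorting the result by value. It is not the Python 2.x code from the post; the dictionary contents are hypothetical, and it assumes Python 3.7 or later so that the sorted dictionary keeps its insertion order.

```python
# Hypothetical input data, for illustration only.
first = {"apple": 3, "banana": 1}
second = {"cherry": 2, "banana": 5}

# Unpack both dictionaries into a new one; for duplicate keys,
# the value from the right-hand dictionary wins.
merged = {**first, **second}

# Sort the (key, value) pairs by value and rebuild a dictionary.
sorted_by_value = dict(sorted(merged.items(), key=lambda item: item[1]))

print(sorted_by_value)
```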
In the following code, we assign default values to a dictionary. First, we create the dictionary and assign values to the names. Then, we try to fetch a dictionary value by its key. If there is no such key, we assign a default value.
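The default-value lookup described above can be sketched with standard dictionary methods. This is an illustrative sketch rather than the post's own code; the names and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: names mapped to marks.
marks = {"Alice": 82, "Bob": 74}

# dict.get returns the fallback for a missing key without changing the dictionary.
carol_mark = marks.get("Carol", 0)

# dict.setdefault returns the fallback and also stores it under the missing key.
dave_mark = marks.setdefault("Dave", 0)

# defaultdict creates the default (here int() == 0) on first access to a missing key.
visits = defaultdict(int)
visits["home"] += 1

print(carol_mark, dave_mark, marks, dict(visits))
```

Use `get` when the dictionary should stay unchanged, and `setdefault` or `defaultdict` when the missing key should be created.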
Let us take a real-world example from a banking system. The bank wants to offer a loan only to active premium customers who are not on the defaulter’s list. For this, we are using the following code in Python. Method: 1 Method: 2 In this example, if the bank gives a loan to the… Read More »
Basic Teradata Query (BTEQ) is a powerful utility in Teradata. You can write the data of a table into a file using the BTEQ export utility. You can also use it to execute conditional statements based on certain logic and to run all kinds of DML statements. In this post, we will see how… Read More »
Looping through a list is a very useful and much-needed function in every scripting and programming language. The most common looping method in the programming world is the for loop. In this blog, we will see how to iterate over strings separated by a delimiter in a shell script. For loop syntax: Let us see… Read More »