Category: BigData
What is a UDF in Hive
UDF stands for User-Defined Function in Hive. A UDF is a function that can be defined and used by a user to perform operations on data stored in Hive tables… Read more »
What are the ACID properties in Hive, and how to update data in Hive
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that ensure that database transactions are processed reliably. Atomicity requires that either all the operations in a transaction are completed… Read more »
Types of tables in Hive and table creation in Hive
Hive is a data warehouse system built on Apache Hadoop that offers a SQL-like query language. It provides a way to manage and query structured data stored in the Hadoop Distributed File System… Read more »
What are the ACID properties in Hive, explained with an example
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that ensure reliable and secure transaction processing in databases. The ACID properties provide a guarantee that transactions in a database… Read more »
Data types supported by Hive
Primitive data types: numeric types (TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE); string types (STRING, CHAR, VARCHAR); date and… Read more »
What is vectorization in Hive, with an example
Vectorization in Hive is a performance optimization technique that allows Hive to process large amounts of data more efficiently. It works by processing multiple rows of data in a single… Read more »
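As a rough illustration of the batch-at-a-time idea in this excerpt, the sketch below contrasts a per-row loop with a loop that hands a whole batch of column values to each operator call. It is not Hive code: the function names are hypothetical, and the batch size of 1024 is an assumption chosen to mirror Hive's default vectorized batch size. Plain Python will not show the speedup; only the control flow is the point.

```python
from typing import Iterator, List

BATCH_SIZE = 1024  # assumption: mirrors Hive's default vectorized batch size


def batches(column: List[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Split one column of values into fixed-size batches."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def sum_positive_row_mode(column: List[int]) -> int:
    """Row mode: the filter and the aggregate run once per row."""
    total = 0
    for value in column:
        if value > 0:
            total += value
    return total


def sum_positive_batch_mode(column: List[int]) -> int:
    """Batch mode: the filter and the aggregate run once per batch of rows."""
    total = 0
    for batch in batches(column):
        total += sum(v for v in batch if v > 0)
    return total


data = list(range(-5, 3000))
row_result = sum_positive_row_mode(data)
batch_result = sum_positive_batch_mode(data)
```

Both functions return the same total; the batch version simply makes far fewer operator calls, which is where a real engine saves per-row overhead.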
What is an SMB (Sort-Merge-Bucket) join in Hive
SMB (Sort-Merge-Bucket) join in Hive is a type of join operation that is used when joining two large datasets that cannot fit into memory. The SMB join is performed in… Read more »
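The sort-merge-bucket idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hive's implementation: both inputs are lists of (key, value) rows, bucketing uses `key % num_buckets` as a stand-in for a hash function, and `bucketize`, `merge_join`, and `smb_join` are hypothetical names. Because matching keys always land in the same bucket number on both sides, each pair of sorted buckets can be merged independently.

```python
from typing import List, Tuple

Row = Tuple[int, str]


def bucketize(rows: List[Row], num_buckets: int) -> List[List[Row]]:
    """Assign rows to buckets by key, then sort each bucket by key."""
    buckets: List[List[Row]] = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[key % num_buckets].append((key, value))
    return [sorted(bucket) for bucket in buckets]


def merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Inner-join two bucket lists that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Locate the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][0] == left_key:
                for _, right_value in right[j:j_end]:
                    out.append((left_key, left[i][1], right_value))
                i += 1
            j = j_end
    return out


def smb_join(left: List[Row], right: List[Row], num_buckets: int = 2):
    """Join bucket i of the left input with bucket i of the right input."""
    out = []
    pairs = zip(bucketize(left, num_buckets), bucketize(right, num_buckets))
    for left_bucket, right_bucket in pairs:
        out.extend(merge_join(left_bucket, right_bucket))
    return out


left_rows = [(3, "c"), (1, "a"), (2, "b"), (1, "a2")]
right_rows = [(1, "x"), (2, "y"), (4, "z")]
smb_result = smb_join(left_rows, right_rows)
```

In this sketch the sorting happens inside `bucketize`; in a real bucketed and sorted table that work is done once at write time, so the join itself only needs the merge pass.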
Map-side join in Hive explained
Map-side join in Hive is a technique used to improve the performance of join operations in large-scale data processing. In a map-side join, the join operation is performed by the… Read more »
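A map-side join can be pictured as a hash join in which the smaller input is held in memory and the larger input is streamed past it, so no shuffle or reduce step is needed. The sketch below is a minimal illustration of that idea, assuming the small input fits in memory; `map_side_join` and the sample data are hypothetical and are not part of Hive.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def map_side_join(
    big_rows: Iterable[Tuple[int, str]],
    small_rows: Iterable[Tuple[int, str]],
) -> Iterator[Tuple[int, str, str]]:
    """Inner-join two (key, value) inputs by probing an in-memory hash table."""
    # Build phase: load the small input into a hash table keyed by join key.
    lookup: Dict[int, List[str]] = {}
    for key, value in small_rows:
        lookup.setdefault(key, []).append(value)
    # Probe phase: stream the big input and emit matches as rows arrive.
    for key, value in big_rows:
        for small_value in lookup.get(key, []):
            yield (key, value, small_value)


orders = [(1, "laptop"), (2, "phone"), (1, "mouse"), (3, "desk")]
customers = [(1, "Asha"), (2, "Ben")]
joined = list(map_side_join(orders, customers))
```

The large input is never sorted or regrouped here, which is why this approach only works when one side is small enough to hold in memory.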
Which optimization techniques are available in Hive
Hive provides several optimization techniques that can be used to improve query performance. Here are some of the most commonly used optimization techniques in Hive: These are some of the… Read more »