I have been trying to wrap my head around the concept of EXTERNAL tables in Hive.
CREATE EXTERNAL TABLE IF NOT EXISTS MovieData
(id INT, title STRING, releasedate DATE, videodate DATE,
URL STRING, unknown TINYINT, Action TINYINT, Adventure TINYINT,
Animation TINYINT, Children TINYINT, Comedy TINYINT, Crime TINYINT,
Documentary TINYINT, Drama TINYINT, Fantasy TINYINT,
-- hyphenated column names need backtick quoting in Hive
`Film-Noir` TINYINT, Horror TINYINT, Musical TINYINT,
Mystery TINYINT, Romance TINYINT, `Sci-Fi` TINYINT,
Thriller TINYINT, War TINYINT, Western TINYINT)
COMMENT 'This is a list of movies and its genre'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
I created a table using the statement above and then used a LOAD statement to populate it.
LOAD DATA LOCAL INPATH '/home/ubuntu/MovieLens.txt' INTO TABLE MovieData;
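Later I dropped the table in Hive, re-created it, and loaded the data again. But when I run a COUNT on the table, the value I get is twice the number of rows in the file I loaded.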
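To make the sequence concrete, here is a rough sketch of what I ran the second time around. The exact form of the COUNT query is an assumption; the CREATE step is the same CREATE EXTERNAL TABLE statement shown at the top of this post.

```sql
-- Drop the table; for an EXTERNAL table this removes only the metastore entry.
DROP TABLE MovieData;

-- Re-run the CREATE EXTERNAL TABLE statement from the top of this post here
-- (same definition, no LOCATION clause).

-- Load the same local file again.
LOAD DATA LOCAL INPATH '/home/ubuntu/MovieLens.txt' INTO TABLE MovieData;

-- This is the query that reports twice the number of rows in the file.
SELECT COUNT(*) FROM MovieData;
```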
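I have read through several articles saying that dropping an EXTERNAL table does not delete the data; it only removes the table's schema from the Hive metastore.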
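One way to check whether that is what happened here, as a sketch only: DESCRIBE FORMATTED reports the table type and its storage location, and the Hive CLI's dfs command can list the files stored there. The warehouse path below is an assumption (the usual default when no LOCATION clause is given); the real path is whatever DESCRIBE FORMATTED prints. If the file from the first load is still in that directory, the second LOAD ... INTO TABLE would have placed another copy next to it, which would be consistent with the doubled count. LOAD ... OVERWRITE INTO TABLE replaces the existing files instead of adding to them.

```sql
-- Show the table type (EXTERNAL_TABLE vs MANAGED_TABLE) and the Location field.
DESCRIBE FORMATTED MovieData;

-- List the files under the table directory from the Hive CLI.
-- ASSUMPTION: default warehouse path; substitute the Location printed above.
dfs -ls /user/hive/warehouse/moviedata;

-- Replace whatever is already in the table directory instead of appending.
LOAD DATA LOCAL INPATH '/home/ubuntu/MovieLens.txt' OVERWRITE INTO TABLE MovieData;
```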
Can you tell me why Hive behaves this way?