Could you please tell me how to set a retention period on Hive tables? The URL below says that partition discovery and retention are not recommended for use on managed tables, and I don't understand why.

  1. I created a table and added the properties below to the table definition.
  2. Just to be sure, I ran the command `MSCK REPAIR TABLE table_name SYNC PARTITIONS`.
  3. I inserted data into the table. Given the retention period, the partitions should have been dropped after 30 minutes, but nothing was dropped. Am I missing something here? Thank you in advance for your help.

'auto.purge'='true',
'discover.partitions'='true',
'partition.retention.period'='30m',
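
For reference, a minimal sketch of the setup described in the steps above. The table name `events`, its columns, the partition column `dt`, the ORC format, and the choice of an external table are all hypothetical and only for illustration. The sketch also assumes a Hive version that supports these table properties and the `SYNC PARTITIONS` clause, as in the HDP 3.x documentation linked below.

```sql
-- Hypothetical table: name, columns, format, and EXTERNAL are assumptions.
CREATE EXTERNAL TABLE events (
  id      BIGINT,
  payload STRING
)
PARTITIONED BY (dt STRING)
STORED AS ORC
TBLPROPERTIES (
  'auto.purge'='true',
  'discover.partitions'='true',
  'partition.retention.period'='30m'
);

-- Step 2: sync the metastore with the partition directories on the file system.
MSCK REPAIR TABLE events SYNC PARTITIONS;

-- Step 3: insert a row into one partition.
INSERT INTO TABLE events PARTITION (dt='2022-01-11') VALUES (1, 'example');

-- Confirm the properties are set, and list the partitions the metastore knows about.
SHOW TBLPROPERTIES events;
SHOW PARTITIONS events;
```

Running `SHOW PARTITIONS` again after the retention period has elapsed shows whether any partition was actually dropped.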

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/using-hiveql/content/hive-manage-partitions.html

Vijju
  • Could you please run `ALTER TABLE mytable SET TBLPROPERTIES ('discover.partitions'='true');`, then `invalidate metadata mytable`, and check again? – Koushik Roy Jan 11 '22 at 04:57
  • Hi @KoushikRoy, `discover.partitions` is already set to true. `invalidate metadata` doesn't work for me :( It fails with `cannot recognize input near 'invalidate' 'metadata'`. May I know on which Hive version the retention period works? – Vijju Jan 11 '22 at 09:23
  • You can use `ANALYZE TABLE mytab COMPUTE STATISTICS` to refresh the data. Also, as far as I know, existing partitions will remain as they are, but newer partitions will be dropped along with their data. The Hive version may need to be 2.1 or higher. – Koushik Roy Jan 11 '22 at 09:33
  • @KoushikRoy My Hive version is 1.0. The retention period might not be working because of the older version in my case :) – Vijju Jan 11 '22 at 13:53
