Analytics Anvil

Big data, Open Source BI and Analytics technologies.

Tag: subfolders

Using HDFS path and filename as columns in a Hive table

Mar 7, 2016. Posted in hadoop, hive

A handy feature of Hadoop Hive is the ability to use the filename and path of the underlying files as columns in a view or table, via the virtual Hive column INPUT__FILE__NAME. This is particularly useful for external tables where some metadata about the files is embedded in the location on HDFS or the…
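
As a rough illustration of the idea, the HiveQL below is a minimal sketch that exposes INPUT__FILE__NAME as ordinary columns through a view. The table name `web_logs`, its single `log_line` column, the location `/data/web_logs` and the view name are all hypothetical and not taken from the post; the regular expressions simply assume that the last path segment is the filename and the segment before it is the containing folder.

```sql
-- Hypothetical external table: name, column and location are illustrative only.
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  log_line STRING
)
LOCATION '/data/web_logs';

-- View that surfaces the file path, filename and parent folder as columns.
CREATE VIEW IF NOT EXISTS web_logs_with_source AS
SELECT
  log_line,
  -- Full path of the file each row was read from.
  INPUT__FILE__NAME AS source_path,
  -- Last path segment, i.e. the filename.
  regexp_extract(INPUT__FILE__NAME, '([^/]+)$', 1) AS source_file,
  -- Second-to-last path segment, i.e. the folder holding the file.
  regexp_extract(INPUT__FILE__NAME, '([^/]+)/[^/]+$', 1) AS parent_folder
FROM web_logs;
```

Querying the view then lets the derived columns be filtered or grouped like any others, for example `SELECT parent_folder, count(*) FROM web_logs_with_source GROUP BY parent_folder;`.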

Tags: column, filename, hdfs, Hive, INPUT__FILE__NAME, metadata, path, subfolders, virtual column