From PostgreSQL wiki
I would like to initiate this discussion, though I have already submitted this idea to some AFS members of the PostgreSQL development team.
BigGreSQL = PostgreSQL + Big Data (Hive engine integrated)
Today the market is growing in such a way that a typical SQL engine will not be able to support Big Data, where volumes run to multiple terabytes and the schema is loosely defined.
I see this as a very good opportunity for the open-source Postgres community to capture this space first.
What is the benefit? Hive is based on the same principle as the Postgres cost-based engine for SQL plan generation; the difference is that when it creates the execution plan, it generates MapReduce jobs. As I have already explained in the document I shared, its catalog looks much like any RDBMS catalog.
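To illustrate what "creating a MapReduce job" from a SQL statement means, here is a minimal sketch in plain Python. It is not Hive's or Postgres's actual code; the table, the column names (`region`, `amount`) and the function names are hypothetical and chosen only for illustration. It shows how a query such as `SELECT region, SUM(amount) FROM sales GROUP BY region` can be expressed as a map step, a shuffle step and a reduce step.

```python
from collections import defaultdict

# Hypothetical "sales" table; column names are assumptions for this sketch.
sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]

def map_phase(rows):
    """Emit one (key, value) pair per row: the GROUP BY column and the summed column."""
    for row in rows:
        yield row["region"], row["amount"]

def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Apply the aggregate (SUM) to each group of values."""
    return {key: sum(values) for key, values in groups.items()}

def run_query(rows):
    """Equivalent of: SELECT region, SUM(amount) FROM sales GROUP BY region."""
    return reduce_phase(shuffle(map_phase(rows)))

result = run_query(sales)
```

In a real cluster the map and reduce steps would run in parallel on different nodes over data stored in HDFS; the sketch only shows the logical shape of the job that a SQL planner would have to produce.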
Today, in most cases, we end up using Hadoop/Hive as an intermediate stage: data is brought into HDFS, processed using either Hive or MapReduce (M-R) jobs, and then pushed back. This also leads to data integrity issues, as dimension tables change with every load.
If the Postgres catalog is enhanced to capture metadata for Big Data, it will be able to handle this very smoothly.
I would like to get the community's view on this so that it can be developed.
Jayant Dani, Solution Architect and Data Architect, Head of the Technology Excellence group in Data Management and Hadoop, Tata Consultancy Services. Contact: Jayant.Dani@tcs.com
Today I came across an article from IBM about a clustered filesystem. It is nothing but IBM's implementation of GFS, and it is trying to tame Apache Hadoop. It will then not be difficult for them to integrate it with DB2. That is why I am suggesting that Postgres adopt this at the earliest, since an open-source engine that implements SQL on M-R is already available.
I hope to see more traction in this forum so that we can take this to the next level.