免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 1502 | 回复: 0
打印 上一主题 下一主题

MySQL as a Data Warehousing Platform [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2008-06-16 23:07 |只看该作者 |倒序浏览

By nisingh
(Open Source Analytics Evangelist) Posted 3/30/2006
In the next post, we'll start
doing some hands on. But before that we need to select a Data Warehouse
Platform. Don't worry, I am not going to nudge you towards some
commercial "Data Warehouse Solution" here! As promised, it will be open
source and free.
Since a Data Warehouse is essentially a
database, we could use any standard database for the purpose. But since
we aren't making just a toy to impress people with, let's stop for a
moment and look at what we need. So lets go over that...
Unlike
an OLTP system that needs to write very fast, we would be better off
with a database that can read lots of data faster because that is what
we would be doing most of the time. (See previous post
Database vs Data Warehouse
).  
We
would also like the system to be production grade, robust, 24x7, and
common enough so that finding expertise is not impossible. Do note that
24x7 availability may not be as critical for a DW as it would be for
the OLTP servicing your bank's ATM or rerouting your flights.
Demonstrated
ability to handle huge (really HUGE) amounts of data is a must as well,
and it should allow high-speed bulk loading in batch jobs. And it would
be great if there was a way to split very huge tables into smaller ones
(like Oracle partitioning).
And then you would need transaction support.  Right?  Wrong.
Huh?
Well, look at it this way. If your OLTP system did not have
transactions, we'd be dead. But a DW is not a transaction oriented
system. It would be more efficient to use a non-transactional database
as you do away with the overheads required for enforcing transactions.
This is a debatable point, but a non-transactional database would be a
lot faster than a transactional database in simple read write tasks.
MySQL, configured right, makes an obvious and affordable choice for Data Warehousing.  
MySQL's
non-transactional MyISAM db engine is one of the fastest around when it
comes to querying large amounts of data. You don't get transaction
support, and that's alright because you don't need transactions (and
the huge overheads they bring). And its bulk insert features (LOAD DATA
INFILE
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP