征求postgresql部署的建议 [复制链接]

论坛徽章:: 0

41楼 [报告]

发表于 2008-02-17 17:31 |只看该作者

POSTGRESQL 与水晶报表(PostgreSQL and Crystal Reports HOWTO)

内容来自:http://www.postgresql.org/docs/techdocs.10
pgsql crystal
Technical documentation > Community Articles > PostgreSQL and Crystal Reports HOWTO
Purpose of This Document

I started using Crystal Reports around release 7. Until recently I had used it to report against Microsoft Access, Microsoft Outlook, Microsoft SQL Server, and Sybase databases. When I began reporting against a PostgreSQL database I ran into some quirks that were difficult to find solutions for. I hope this document will help others who must use a Crystal Reports and PostgreSQL combination.
Pre-requisites

1. A working version of Crystal Reports. This document was written using Crystal Reports 9.0 Professional as the development tool. Prior versions of Crystal should support reporting against a PostgreSQL datasource with the exception of stored procedures. Your version of Crystal must support editing of the SQL statements generated by Crystal to report against a stored procedure.
2. I recommend that you install Service Pack 1 and/or search for a hot fix from Crystal. There is a bug in Crystal 9.0 with how Crystal interprets some database column names because of the data type that PostgreSQL uses.
3. This document was written using PostgreSQL 7.X as the database version. Access to a PostgreSQL database is of course required. If you cannot connect to the database please contact your admin or read the PostgreSQL documentation to check your database settings.
4. A PostgreSQL ODBC driver. This document was written using ODBC to connect to the database. A driver can be fetched from http://gborg.postgresql.org/project/psqlodbc/projdisplay.php. I used version psqlodbc-07 03 0200.
5. An ODBC connection configured to your PostgreSQL database.

Data Sources

1. Reporting against tables
2. Reporting against views
3. Reporting against stored procedures
* Without parameters
* With parameters

Reporting against tables

Reporting against tables can be done with cavets. Tables can be added to Crystal Reports using the database expert like any other database BUT must be a member of the public schema. If the table(s) are not a member of the public schema and you try to add them to your report Crystal will give you an error message(Not supported). If you must report against a table in a schema other than the public schema create a view against the table and report off of that view. You can add multiple tables and link them in Crystal as with any other database.
Reporting against views

Reporting against views is the way most reports against Crystal will be done. There is nothing special you need to know or any special tricks when reporting against views. When creating Views however I recommend that you specify a data type for the column name to ensure that Crystal will be able to read the column name if you are renaming a column or assigning a name to a column.. An example of this is the following: CREATE OR REPLACE VIEW myschema.view_intercultural AS SELECT d.client_name, 'Intercultural Overall'::bpchar AS question FROM myschema.international d; Don't use ::text in place of ::bpchar in the above example, otherwise Crystal will not let you choose the column for grouping.
Reporting against stored procedures
Without parameters

This is the real reason I wrote this mini howto. Reporting against stored procedures in PostgreSQL is much different than what I had experienced with other databases. In order to report against a stored procedure you must be able to tell Crystal what data fields and their types it can expect to receive from the stored procedure. The only specify the data fields is to manually enter the SQL statement into the Crystal report. The syntax is exactly the same as if you were calling the stored procedure using psql with the exception of formatting of any parameters you are passing to the stored procedure. In Crystal 9.0 this is done by selecting from the menu on top Database->Select your ODBC datasource that you set up->Add Command. This will open a pop up window where you can enter the SQL statement directly. The format of the statement for a stored procedure without parameters is the following: SELECT field1, field2 FROM myschema.mystoredprocedure AS (field1 varchar(100), field2 int4) This is SQL for creating the stored procedure referenced in the report: CREATE OR REPLACE FUNCTION myschema.mystoredprocedure RETURNS SETOF record AS ' DECLARE r record; BEGIN for r in EXECUTE \'SELECT t.field1, t.field2 FROM myschema.mytable t ;\' loop return next r; end loop; return; END; ' LANGUAGE 'plpgsql' STABLE; The SQL for the table itself is the following: CREATE TABLE myschema.mytable ( field1 varchar(100), field2 int4 );
With parameters

In Crystal 9.0 this is done by selecting from the menu on top Database->Select your ODBC datasource that you set up->Add Command. This will open a pop up window where you can enter the SQL statement directly. The Create button on the right side of the screen is used to create any parameters that you wish to pass from Crystal to your stored procedure. The format of the statement for a stored procedure with parameters is the following: SELECT * FROM myschema.mytable({?parameter1},{?parameter2}) as recs( field1 varchar(100), field2 int4, submit_date date) This is SQL for creating the stored procedure referenced in the report: CREATE OR REPLACE FUNCTION myschema.mystoredprocedure(date, date) RETURNS SETOF record AS ' DECLARE r record; from_date ALIAS FOR $1; to_date ALIAS FOR $2; BEGIN for r in EXECUTE \'SELECT t.field1, t.field2, t.submit_date FROM myschema.mytable t WHERE t.submit_date >=\'||quote_literal( from_date )||\' and t.submit_date <=\'||quote_literal( to_date )||\' ;\' loop return next r; end loop; return; END; ' LANGUAGE 'plpgsql' STABLE; The SQL for the table itself is the following: CREATE TABLE myschema.mytable ( field1 varchar(100), field2 int4. submit_date date );

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

linuxdee

白手起家

论坛徽章:: 0

42楼 [报告]

发表于 2008-02-18 16:59 |只看该作者

数据仓库与数据挖掘

内容见附件（word文档），或见本人Blog

[ 本帖最后由 linuxdee 于 2008-2-18 17:00 编辑 ]

datamining.rar

75.94 KB, 下载次数: 330

数据仓库与数据挖掘

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

ts_keta

白手起家

论坛徽章:: 0

43楼 [报告]

发表于 2008-02-18 17:46 |只看该作者

我想到几个思路，可以供楼主参考下：
1：
8K个MSSQL的数据库结构基本一到致，每级部门做一个统计汇总的功能，下一级系统往相应的上一级系统上报（传输）统计数据。
开发工作量为5次统计和5次数据传输
数据库可以利用现有的MSSQL，只需将原有数据统计汇总即可，不用购买新的数据库
数据的传输方式犹为重要，如果8K个DB结点已能够互访（至少每一级与其上下级能实现数据沟通），那么可以将网络开销降至最低。否则要在数据传输上下一番功夫。
2：
针对这一需求只做一个中心数据仓库，将所有的第六级部门（4000个左右）的数据抽取至该仓库内，一级级地汇总，可依次得到六个部门级别的汇总数据。
开发工作量为5次统计和1次数据传输
由于采用数据仓库，最好能使用大型数据库和对BI支持比较好的数据库，DB2，oracle等都可以。
如何将4000个第六级部门的数据传输至中心数据仓库（也就是所谓的数据抽取）是关键。
该方法成本低，开销小，周期短，有一个统一的系统为该项统计业务做支持，开发维护实施都比较方便。不过由于是只针对这一需求而开发的数据仓库，故扩展性不强，不能和其它的数据和业务联系起来，功能单一。
3，基于第2种思路的扩展：
同样是针对该项统计业务要开发一个中心数据仓库来完成该项业务，不同的地方在于，在创建数据仓库前需要规划好一个强大的数据仓库，尽可能的将公司内所有的数据和业务都联系起来，以实现将来可能需要的BI功能的扩展，不过不必实现这些功能。（唯一实现的只是楼主需要的统计业务）
这样做的好处是系统功能强大，扩展性好，为公司将来可能做一个基于全局的数据仓库奠定了基础。
难点和时间的花费在于前期的规划和需求。
缺点也显而易见，主要就是需求和规划耗时长。而成本只会比思路2高一些，不过不是特别多。

希望能给楼主提供有用的思路~

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

linuxdee

白手起家

论坛徽章:: 0

44楼 [报告]

发表于 2008-02-18 19:53 |只看该作者

谢谢建议

因为除了年底的统计,相关业务我也要想搞数据集中模式下的应用.

[ 本帖最后由 linuxdee 于 2008-2-18 19:59 编辑 ]

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

huain

白手起家

论坛徽章:: 0

45楼 [报告]

发表于 2008-02-18 20:42 |只看该作者

联系google看看

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

arust

家境小康

论坛徽章:: 0

46楼 [报告]

发表于 2008-02-19 10:35 |只看该作者

PostgreSQL 8.3版本正式发布

2008年2月4日，纽约：PostgreSQL全球开发组宣布高性能的面向目标的数据库管理系统 PostgreSQL8.3版发布。来自18国家和地区的几十名PostgreSQL开发人员为新版本的PostgreSQL贡献了超过280处更新，这个版本的一系列新的和改进的特性，对应用程序设计人员、数据库管理员、数据库用户都极大地增强了可用性。)

“开源数据库PostgreSQL持续的发展给了用户一个发布非商业化数据库的选择，可以使用户节约费用、提升性能和提高生产效率。PostgreSQL 8.3版是一个令人印象深刻的新版本，我们鼓励我们世界各地的客户去使用它”， Sun公司的执行副总裁Rich Green先生说道。

PostgreSQL 8.3版比以前的版本提供了更强大的性能一致性，它保证了一年52周，每周７天，不论是负载高峰时刻还是非高峰时刻，均能提供类似近期在性能测试中表现的一样的高性能。主要的性能增强内容包括：

* 堆内索引技术（HOT)，它对更新频繁的表最多可减少3/4的系统开销。
* 负载分布式的检查点功能，减少了检查点功能对系统响应时间的影响。
* 可自动执行的后台写入程序的自我调节功能。
* 对部分事务执行时的异步提交功能可大幅加快响应时间。

这些改变也大大地提高了数据库的事务处理的能力，根据不同的工作负载，可提高性能5%至30%之间。

“PostgreSQL变得更加智能了”，PostgreSQL核心开发组成员Bruce Momjian解释说。

PostgreSQL 是第一个实现同步扫描的开源数据库，该功能极大地减少了数据挖掘时的Ｉ／Ｏ次数。Windows平台的开发组现在已经可以用 Visual C++编译PostgreSQL，这也提高了PostgreSQL在Windows平台的稳定性和性能，也使Windows的开发人员更方便参与 PostgreSQL的相关项目。新增了新的日志选项和减少了系统状态收集器的开销也使得监控服务器的状态变得更加容易。

当然，对应用程序开发人员来说，没有提供很多新特性就不是一个全面更新的PostgreSQL了，新的特性包括：
* 对于SQL/XML标准的支持，包括XML的导出。
* 全文搜索：高级全文搜索工具TSearch2现在已嵌入系统核心中，同时管理更加方便，增加了新的语言和字典。
* GSSAPI和SSPI认证的支持。
* 新的数据类型：UUIDs、ENUMs和复合类型的数组。

“我们使用PostgreSQL服务器对我们达300G的用户数据库执行每秒超过18000次的查询，并且这些用户数据每月还在上升，”社交类（SNS）网站 MyYearbook.com的ＣＴＯ技术总监Gavin Roy说道，“我们对HOT功能、共享缓冲的扫描提升和集成Tsearch索引功能都感觉很兴奋。初步的测试显示8.3版本将提高我们数据库的负载能力，缩短系统响应时间。我们真的很期待新版本的发布。”

除了很多的新功能外，一些附加的模块也已完善并随着8.3版本的发布而发布1.0版本。这些模块包括SNMP支持、PL/Proxy、pgPool2，以及一个图形化的存储过程调试器Bucardo，还有名为pgBouncer的连接缓冲池。这些模块现在都已可以使用。

在8.3版本中还有其他一系列辅助功能和改进的地方，这些都使PostgreSQL成为一个更加强大的数据库平台，可浏览相关新闻发布稿以浏览更多细节： http://www.postgresql.org/about/press/presskit83

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

linuxdee

白手起家

论坛徽章:: 0

47楼 [报告]

发表于 2008-02-19 13:19 |只看该作者

从MSSQL7到PostgreSQL7 的迁移

文章来自:http://www.postgresql.org/docs/techdocs.28
Converting your data from MS SQL Server 7 to PostgreSQL 7.1.x
by Ryan C. Bonham <ryan@ryanbonham.com>
v1.00, Last updated 5th January 2002

Prerequisites

* MS SQL 7 (of course)
* RedHat Linux 7.1
* PostgreSQL 7.1 RPM's
* PostgreSQL ODBC Client

Directions
Follow the steps below to convert your Microsoft SQL Server 7 data to PostgreSQL.
Anything following a "#" is the exact command to type at the shell prompt.

From Your Redhat Machine

1. As root, create a postgres user.
   # adduser postgres

2. Install the PostgreSQL RPM's on your machine.

3. Start PostgreSQL for the first time.
   # service PostgreSQL start

4. Set PostgreSQL to run at system Startup.
   # chkconfig --add PostgreSQL

5. Login as the Postgres User
   # su postgres

6. Create a Blank Database
   # createdb Databasename

From your SQL 7 Server or a machine with the SQL administration tools installed

1. Install the PostgreSQL ODBC Drivers

2. Create A New User or System DSN (Data Source)
   On Windows 2000, you can find this under the "Administrative Tools" Control Panel

      1. When Prompted for Driver to use, select "PostgreSQL"

      2. For Data Source enter the name you wish to give this connection.  (e.g. Pgsql)

      3. For Database enter the name you used in Step 6 above

      4. For Server enter the IP or name of your PostgreSQL Database Server

      5. Leave the Username and password blank

      6. Under the Driver Options Select "Parse Statements" and Uncheck "Bools as char"

      7. Under "DataSource" Options select "Row Versioning"

3. Open the SQL 7 Enterprise Manager

4. Expand your server and select "Data Transformation Services"

5. Open the "Local Packages"

6. Select New package from the Action Menu

7. Select "Microsoft OLE DB Provider for SQL Server" from the Data Menu

      1. Pick Your SQL Server and database you wish to export and press ok

8. Select "Other Connection" from the Data menu

      1. For Data Source select "PostgreSQL"

      2. Select the DSN we created above (Pgsql)

      3. Enter your username (postgres)

9. Select both Connections you just created

  10. Select "Add Transformation" from the Workflow menu

  11. Double click the Transformation Line that was just created

      1. On the sources tab select the table you wish to export

      2. On the destination tab select Create new

      3. Check the DataTypes and Column Lengths.  Make any corrections and press ok

      4. On the Transformation screen make sure the source columns are pointing to the correct destination columns.  You can change the Column mappings by selecting the appropriate columns and using the delete and insert buttons on the screen.

      5. Check the options on the Advanced Tab.  You shouldn't need to make changes

      6. Press Ok

  12. Select Save from the package menu

  13. Select Run from the package menu

  14. Your data should now be exported.  If you get any errors double check the data type and lengths are correct on the table you created in Step 11

[ 本帖最后由 linuxdee 于 2008-2-19 13:24 编辑 ]

实战分享：从技术角度谈机器学习入门| 【大话IT】RadonDB低门槛向MySQL集群下战书 | ChinaUnix打赏功能已上线！ | 新一代分布式关系型数据库RadonDB知多少？

goodhawk

家境小康

论坛徽章:: 0

48楼 [报告]

发表于 2008-02-19 21:25 |只看该作者

Oracle肯定能胜任这样的工作的。

1，采用pg，oracle并行思路不错，你的第2套方案不错，采用1个中心db，10 分db

2，如果最末级的msdb 有人管理，其实不切换也是可以同步，并且集中起来的，MTS 就是干这个活的。 Oracle的透明网关也是干这个的，进行数据采集是没有问题的。

3，如果想完全切换过来短时间估计有困难，因此建议你先把后面集中的做好，然后试点进行一些部门进行数据采集。

4，关于备份，存储 IBM，EMC 都有好多方式，扯得还很牛。

5，OS 选择个人还是很喜欢solaris的。

6，开发的话，其实用php，java 都OK。 Java 似乎更快些。

7，我看了你关于crystal report的东东，本人强烈鄙视crystal report的东西，因为这个家伙把很简单的东西搞得超复杂，一个负载均衡搞得浑天暗地，report server 也是强烈的不稳定，因为曾经把这个东西（report tool ＋ report server）用了很久，个人的感受，因此强烈鄙视一把。