SQL Server : Data Distribution

greenspun.com : LUSENET : SQL Server Database Administration : One Thread

Hello!

For example, we have a table TX(c1 integer, c2 date) and a query Q which works on TX. Assume Q is just working on a restricted c2 range and tuples are ordered on c1. A and B are the number of all tuples in TX and the number of tuples which satifies c2 range respectively.

Below query will process B tuples.

Q :

Select ... From TX Where c2 between date1 and date2

What I want to do is partitioning queries by appending range predicates to their where clause. Then, I want to execute new instances of query on different replica of database and to collect the results. After appending partitioning condition, Q will look like:

Q :

Select ... From TX Where c2 between date1 and date2 AND c1 between LoBound and UpBound

But, to equally distribute the load among several databases, almost equal number of tuples which satisfy c2 range must fall in c1 ranges. And c1 ranges must be determined automatically by application program which sends partitioned queries to replicas.

And the questions:

Is it possible to determine the distribution of one column in another column?

How can I access (store the result in a structure e.g. temp table) the result DBCC SHOW_CONFIG command ?

How can I interpret the value in sysindexes.statblob attribute?

Thanks in advance.

Kind regards, Fuat Akal.

-- Anonymous, July 01, 2001

Answers

Fuat,

Here are answers to your questions.

Q1. Is it possible to determine the distribution of one column in another column?

A1. Not by using DBCC SHOW_STATISTICS. DBCC SHOW STATISTICS shows the distribution based on a single column, but you can query like so:

select c1, count (*) as c1Count from TX where c2 between date1 and date2 group by c1

Q2. How can I access (store the result in a structure e.g. temp table) the result DBCC SHOW_CONFIG command?

A2. There is no DBCC SHOW_CONFIG command, but you can store the results of some (but not all) DBCC commands by using one of the following two methods.

The INSERT INTO #temptable EXEC ('DBCC ') method, which works for useroptions for example.

drop table #tb_setopts go CREATE TABLE #tb_setopts (SetOptName varchar(35) NOT NULL ,SetOptValue varchar(35) null) INSERT INTO #tb_setopts (SetOptName,SetOptValue) EXEC('dbcc useroptions') select * from #tb_setopts

The xp_cmdshell method. To make this work you'll need to use xp_cmdshell and ISQL as follows :-

-- Note that all quotes are single quotes except the ones on the -Q option, which are double quotes DROP TABLE #maint go DECLARE @SQLSTR varchar(255) SELECT @SQLSTR = 'ISQL -E -Q"dbcc checkdb(master)"' CREATE TABLE #maint (Results varchar(255) NOT NULL) INSERT INTO #maint(Results) EXEC('master..xp_cmdshell ''ISQL -E - Q"dbcc checkdb(master)"''') select * from #maint

Q3. How can I interpret the value in sysindexes.statblob attribute?

A3. The statblob column is basically what you get when you run DBCC SHOW_STATISTICS. Specifically it contains the steps and density information. You can read more about it at http://msdn.microsoft.com/library/default.asp?url=/library/en- us/dnsql2k/html/statquery.asp.

Hope this helps,

Eric

-- Anonymous, July 01, 2001


Moderation questions? read the FAQ