Tuesday, April 29, 2008

SQL Injection Attack

As I mentioned in one of my earlier posts, dynamic SQL is often adopted as an easy way of coding. Improper use of dynamic TSQL creates a lot of performance issues because of the way it is treated by the query engine, and not all dynamic TSQL queries can have their plans cached. More important than performance, avoiding dynamic SQL as much as possible helps avoid the security threat of SQL injection. There is now a mass SQL injection attack going on which has already affected a large number of web pages worldwide. As per F-Secure, the injected SQL string updates all the text fields in the database, appending a JavaScript snippet so that the affected web pages serve the malicious links automatically. You can read the complete story here – http://www.f-secure.com/weblog/archives/00001427.html. We all know this is an exploitation of bad coding practices and not a flaw in IIS or SQL Server. Let us adopt the best coding practices and make our applications secure. Performance is important, but I do feel that security is more important than performance.
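As an illustration of the safer pattern (the table and column names here are made up for the example), a parameterized search with sp_executesql keeps user input as data instead of executable TSQL:

```sql
-- Hypothetical search over an Articles table; @SearchTerm comes from the user.
DECLARE @SearchTerm nvarchar(50);
SET @SearchTerm = N'summer'; -- even a value like N'''; UPDATE ... --' is harmless here

-- The input is bound as a typed parameter, so it can never change the shape
-- of the statement, and the parameterized plan gets cached and reused.
EXEC sp_executesql
    N'SELECT Title FROM dbo.Articles WHERE Title LIKE ''%'' + @Term + ''%''',
    N'@Term nvarchar(50)',
    @Term = @SearchTerm;
```

Concatenating @SearchTerm directly into the string and calling EXEC() is exactly the vulnerable pattern this attack exploits.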

Monday, April 28, 2008

Capacity Planning - Planning the Disk Subsystem

Planning the disk subsystem for a database infrastructure is the process of deciding the RAID strategy that satisfies both the throughput requirements and the storage requirements. The usual best-practice guideline insists on five separate physical-drive sets, each with its own logical drive, mapping to the operating system, the transaction logs of the production database, the data files of the production database, TempDb, and lastly, the backup file storage. Before going through each of these, let us discuss the various standard RAID configurations and their characteristics. (The comparison table appeared as an image in the original post.)
We can see that the fault-tolerant RAID configurations increase the number of physical IOs generated per logical write; to compensate, we can increase the number of physical disks so that the IOs get divided among them. In other words, adding physical disks increases the throughput. However, each individual disk has a physical throughput limit, such as 300 IOps, and the overall throughput is also bound by the limitations of the IO bus.

Requirements for Different Components
Let us take the main components of a SQL Server instance and see how to design the disk requirements for each. As mentioned above, the usual best-practice guideline insists on five separate physical-drive sets, one each for the operating system, the transaction logs of the production database, the data files of the production database, TempDb, and the backup file storage.

The operating system and the SQL Server program components do not have high throughput or storage requirements; the main consideration here is recoverability. Accordingly, we can assign a RAID1 mirror to the operating system and SQL Server program components so that we can easily recover in case of a failure.

The production database data files, in contrast, have very large storage requirements and high throughput requirements. From the RAID specifications above, we can easily make out that RAID1 cannot be used because it lacks high storage capacity, and RAID0 cannot be used because it offers no fault tolerance. Hence, the choice comes down to RAID5 versus RAID10. Both offer good read performance, while RAID5 is a poor choice for write-heavy workloads. Depending on whether our application is OLTP or OLAP, we can choose between the two.
Meeting the Throughput Requirements
To explain the throughput calculations for a database application, let us assume that the client technical team insists that no disk should exceed 85% of its rated IO capacity. With 300 IOps disks, this means the maximum throughput we may drive through an individual disk is 255 IOps. Now, let us assume that, during peak hours, our application generates approximately 600 reads and 200 writes per second. Considering only the fault-tolerant RAID configurations, let us take each possible configuration and check whether it satisfies the requirements.

First, RAID1. For RAID1, per-disk throughput = (Reads + (2 * Writes)) / 2 = (600 + 400) / 2 = 500 IOps. The maximum throughput we can afford for an individual physical disk is 255 IOps, while RAID1 would drive 500 IOps through each disk. Hence, we can conclude that RAID1 is not possible for the data files of our application.

Next, RAID5. For RAID5, per-disk throughput = (Reads + (4 * Writes)) / No. of Disks. That means No. of Disks = (600 + 800) / 255 ~= 6. We therefore require at least 6 disks to satisfy the throughput requirements if we adopt a RAID5 configuration for the data files.

Finally, RAID10. For RAID10, per-disk throughput = (Reads + (2 * Writes)) / No. of Disks, so No. of Disks = (600 + 400) / 255 ~= 4. We require only 4 disks to satisfy the throughput requirements if we adopt a RAID10 configuration for the data files. Note that a RAID10 configuration cannot have an odd number of disks: if the calculation had yielded 4.5 disks, we would need 6, not 5, to configure the RAID.
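The disk-count arithmetic above can be scripted; this is just the example's numbers (600 reads, 200 writes, 255 IOps ceiling) plugged into the RAID formulas:

```sql
DECLARE @Reads int, @Writes int, @MaxIOpsPerDisk int;
SET @Reads  = 600;          -- peak reads per second
SET @Writes = 200;          -- peak writes per second
SET @MaxIOpsPerDisk = 255;  -- 85% of a 300 IOps disk

SELECT
    -- RAID5: each logical write costs 4 physical IOs
    -- (read data, read parity, write data, write parity)
    CEILING((@Reads + 4.0 * @Writes) / @MaxIOpsPerDisk) AS Raid5Disks,    -- 6
    -- RAID10: each logical write costs 2 physical IOs (data + mirror);
    -- the result is rounded up to an even number of disks
    CEILING(CEILING((@Reads + 2.0 * @Writes) / @MaxIOpsPerDisk) / 2.0) * 2
        AS Raid10Disks;                                                   -- 4
```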
Meeting the Storage Requirements
Once we have decided the throughput requirements and the number of disks needed to satisfy them, we need to consider the storage requirements for the application. In the throughput calculation, the focus was the number of disks; in the storage calculation, the focus is the capacity of each disk. Here also, let us assume that the client insists that only 85% of the total disk capacity should be utilized at any time; that is, there should always be at least 15% free space on the disk. If we expect our data files to reach a total size of 115GB, we require a total capacity of 115 / 0.85 ~= 136GB. Considering only the fault-tolerant RAID configurations as above, let us see what the 136GB requirement means for each.

RAID1 (even though we already ruled it out on throughput grounds): a RAID1 configuration has only 2 disks, so each disk needs an individual storage capacity of 136 / 2 = 68GB.

RAID5: one disk's worth of capacity is used for parity. During the throughput calculations, we concluded that we require at least 6 disks for RAID5. Hence, the minimum storage capacity per disk = 136 / 5 ~= 28GB.

RAID10: during the throughput calculations, we concluded that we require at least 4 disks for RAID10. Of these 4 disks, only 2 provide usable storage because of the mirrors, so the minimum storage capacity per disk = 136 / 2 ~= 68GB.
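Similarly for storage, a quick sketch with the example's figures:

```sql
DECLARE @DataSizeGB decimal(9,2), @MaxFill decimal(4,2);
SET @DataSizeGB = 115;   -- expected total data file size
SET @MaxFill    = 0.85;  -- client allows only 85% of capacity to be used

SELECT
    CEILING(@DataSizeGB / @MaxFill)     AS TotalGBRequired,  -- 136
    CEILING(@DataSizeGB / @MaxFill / 2) AS Raid1DiskGB,      -- 68 (2 disks)
    CEILING(@DataSizeGB / @MaxFill / 5) AS Raid5DiskGB,      -- 28 (6 disks, 1 parity)
    CEILING(@DataSizeGB / @MaxFill / 2) AS Raid10DiskGB;     -- 68 (4 disks, 2 usable)
```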

Default Trace in SQL Server 2005

There have always been a few questions about keeping the default trace enabled on a SQL Server 2005 production box. Personally, I do feel that the trace is a very handy tool for finding out what has happened on a server. This is especially true for detecting changes in development or testing environments where the DBAs don't have much control over the activities. That said, tracing down the culprit will not be very successful if shared logins are being used. (DON'T let people use shared logins; it will trouble you sooner or later.) I found the default trace most beneficial in environments where there aren't many experienced DBAs and there isn't a strong, standardized security or monitoring infrastructure; there, the DBAs normally look back to see what happened only after someone has created some real mess. There have been some discussions on the performance impact of letting the default trace run in production. I haven't seen much performance benefit in my environments from disabling the trace, but I didn't do any load test to confirm this. There is a post from Linchi Shea I recently found with some load test results: http://sqlblog.com/blogs/linchi_shea/archive/2007/01/16/performance-impact-of-enabling-page-checksum-and-default-trace.aspx?CommentPosted=true. I can't say that Linchi's testing is final; I recommend you do your own load testing based upon your applications before coming to a conclusion.
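For reference, this is how I usually check and read the default trace on a 2005 instance:

```sql
-- Check whether the default trace is enabled (it is on by default in 2005)
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'default trace enabled';

-- Locate the default trace file and read the most recent events from it
DECLARE @TracePath nvarchar(260);
SELECT @TracePath = path FROM sys.traces WHERE is_default = 1;

SELECT TOP 100
       te.name AS EventName, t.DatabaseName, t.ObjectName,
       t.LoginName, t.HostName, t.StartTime
FROM fn_trace_gettable(@TracePath, DEFAULT) AS t
JOIN sys.trace_events AS te ON t.EventClass = te.trace_event_id
ORDER BY t.StartTime DESC;
```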

Wednesday, April 23, 2008

Capacity Planning - Sizing a Production Database Server

The process of designing a database server infrastructure to cater to the needs of the application or business before it goes live into a production environment is known as sizing.
It requires skilled requirements collection and analysis to project the needs of the application well before it is deployed. Following are a few generic techniques that are used to size a database server infrastructure.

§ Capacity Planning Interviews
One of the best techniques to understand an application before it is deployed is interaction with the stakeholders. The stakeholders can be anyone involved with the application: the business analysts who did the system requirements collection and analysis, the technical team who designed and developed the application, or the end users who will use it. We need to interact with the various teams to collect as much information as possible regarding the application architecture and its usage. Following are the different aspects we need to understand through the interviews.

Ø Business Requirements
Business requirements can be collected by meeting and interacting with the business users and their leaders. We need to understand the nature of the application: who will be using it, how frequently, how critical it is, how many users will use it concurrently, how complex its usage will be (for example, whether there will be simultaneous data input and data analysis processes), and whether additional employees will be joining who will use the application.
The business users' requirements are the most important factor to understand while doing capacity planning. However well the application is built and maintained, it is ultimately the application usage pattern that determines the performance and availability of the application.

Ø Technical Requirements
While the business requirements of the application can be obtained from the pure business people, the technical requirements should be collected from the technical people who represent the application. The technical requirements to be collected include database response time, application uptime, maintenance windows, expected performance benchmarks, data growth patterns, etc. These requirements have to be collected and documented so that we can size an infrastructure that meets them.

Ø Technical Environment
Collecting information regarding the technical environment of the application comes into the picture in business models where we will not be hosting the infrastructure ourselves. In such cases, the environment is as important as the business requirements, because we need to understand the technical environment in which the database application is going to run. This includes the network infrastructure, the security infrastructure, how efficient the client's system support teams are, whether they have a proper storage infrastructure, etc.
We also need to understand whether the company normally maintains a staging or implementation server alongside the production server. In such a case, any change can be applied and tested on the staging environment before being applied to the production server.

§ Collecting Performance Data and Baselines
This is perhaps the most important step in capacity planning. Having proper baselines in place is one of the main things people tend to forget or ignore when a new application is being deployed; the focus at that time tends to be on having the application satisfy all the functional requirements and go live in production.
It is important to have baselines because, in the absence of a proper set of baselines and workload analysis data, once the application goes live we will not know how it is expected to behave under various workloads and environments. Workload analysis can be defined as the process of collecting information on the expected behavior of the application under various workloads and environments. If a performance issue is reported on a production application, the first thing we can do is take the baseline for that workload and compare it with the current response. Thus we can quickly detect the bottleneck and resolve it accordingly.
Workload analysis requires three main components. They are the prototype server infrastructure, the load simulating tools, and the workload monitoring tools. Each of these three components is explained below.

Ø Prototype Server Infrastructure
A prototype infrastructure that exactly matches the production environment where the application is planned to be deployed needs to be in place. The details required to simulate the production infrastructure have to be obtained from the client during the requirements analysis phase, and a proper prototype has to be established on which the workload analysis can be done. There are numerous occasions where a matching prototype could not be established due to budget constraints; however, it doesn't make sense to compare results obtained from an environment that doesn't match production. Any analysis or tuning should be based on matching environments only.
A few factors that have to be considered while matching the environments are the following –
Hardware configuration
  - Processor power
  - Memory configuration
  - Network infrastructure
  - Storage configuration
Operating Systems
  - Operating system version and patch level
  - Server build
Application implementation
  - Tier configurations
  - Database instance configurations
For well-maintained production environments, there will be proper documentation of the infrastructure configurations. Any new infrastructure change, or the introduction of a new device or asset, will be based on a predefined plan and documentation. The documentation includes such details as the production standard for the operating system (edition, version, etc.), the procedure to build and harden a production server, the standard hardware used for a typical production server, the storage configuration for a database server, and so on. This documentation can be used to replicate the environment for the prototype infrastructure.

Ø Workload Simulation
Once the prototype infrastructure is in place, the next step is to simulate workload on the application. For a typical multi-user database application, we can take the help of the various load simulation tools available, such as LoadRunner.
Simulation tools can be used to simulate various conditions on the application, on the hardware side as well as on the application usage side. Initially, we need to define a fixed set of environments to be simulated, and then define the hardware and software conditions that make up each environment. The response of the application and the infrastructure should be measured against each of these simulated environments; these environments and the responses to them make up our performance baseline data. Baselines can be defined as performance measurements gathered over a specific period of time with specific sampling intervals. We need to ensure that we define a large number of possible environments and collect performance baseline data against each of them.

Ø Response Monitoring & Analysis
The application's response to the various predefined environments should be monitored and documented. The main purpose of having the monitor logs in place is to have official documentation for current and future reference. The logs provide an indisputable track record of the application's response across various environments. Any SLA or other agreement should use these track records as the reference for the application's response capability and its ability to sustain various pressures and adverse environments.
An example of a load response results for an application can be as shown below.
Concurrent Users    Processor Utilization
10,000              40%
20,000              50%
30,000              60%
40,000              70%
50,000              80%
From the table, we can easily make out the relation between the number of concurrent users and the processor load. The document shows the "fact" that if the total number of users is less than 10,000, we should not expect a processor utilization above 40% unless there is something wrong somewhere. This gives management the confidence to commit to the client on the response level of the application, and gives the technical team a baseline reference for finding processor-related application bottlenecks.
There are various tools that can be used for this kind of monitoring. One of the main ones is the built-in Windows Performance Monitor (Perfmon). Perfmon can be used to track the application's response through various software and hardware counters. The tracked response levels can be used to project the response and growth trend of the application with respect to time and load, and serve as the main tool for the post-configuration capacity planning process. The upcoming posts explain monitoring and analysis using such counters specific to physical memory, the processor subsystem, the disk subsystem, etc.
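Apart from Perfmon itself, SQL Server 2005 exposes its own Perfmon counters through a dynamic management view, which is convenient for scripted baseline collection; a small example:

```sql
-- Snapshot a few SQL Server counters for baseline logging.
-- The same values appear in Perfmon under the SQLServer:* objects.
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Batch Requests/sec',
                       'Buffer cache hit ratio',
                       'User Connections');
```

Scheduling such a query and storing the results in a table gives a cheap, always-on baseline alongside the Perfmon logs.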
Ø Projecting the Trends
Projection is one of the main processes involved in post-configuration capacity planning. It is explained here because it comes as a continuation of, and makes use of, the monitor logs that we obtained from the load simulation environments.
Projection is the process of estimating the response level of the application with the help of the existing load response logs. In the example of users vs. processor utilization above, we have information for a maximum of 50,000 concurrent users. Assume that the client plans to have 10,000 more analysts use the application. We can extend the trend line from the existing information as follows. (The trend-line chart appeared as an image in the original post.)

The graph shows us that the additional users will increase the processor subsystem utilization to 90% of the total processor capacity. This information helps the stakeholders plan and take decisions that let them use the application effectively for their business.

Linear Growth and Geometric Growth
In the above example, we assumed a direct, straight-line relationship between the number of concurrent users and the processor utilization; such a relationship gives a linear projection. That may not always be the case. There may be an increased amount of data loaded as the number of users grows, which means the relationship between the users and the processor utilization can be geometric rather than linear. Consider the following example, where we have the data growth for the application by period.

Period Total Data
200701 100
200702 110
200703 121
200704 133
200705 146
200706 161
200707 177
200708 195
200709 214
200710 236
200711 259
We can see that the data size increases following the pattern d + d*(10/100), where d is the previous period's data size – i.e. 10% growth per period. When we need to project the disk requirement for period 200712, we can calculate it as 259 + (259 * 0.1) ~= 285. Here, the growth is geometric. (The projection chart appeared as an image in the original post.)
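The compound growth above can be computed directly; a quick sketch in TSQL, with the 10% rate and the 11 periods taken from the table:

```sql
DECLARE @BaseSize float, @GrowthRate float, @Periods int;
SET @BaseSize   = 100;   -- data size at period 200701
SET @GrowthRate = 0.10;  -- observed growth of 10% per period
SET @Periods    = 11;    -- number of periods from 200701 to 200712

-- geometric projection: size(n) = size(0) * (1 + rate)^n
SELECT ROUND(@BaseSize * POWER(1 + @GrowthRate, @Periods), 0)
       AS ProjectedSize200712;  -- ~285
```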
§ Final Decision on the Production Server Infrastructure
The final decision on the database server infrastructure is based on the performance data and baselines collected in the performance testing phase. When we share the performance test results with the client, the client will share the parameters that the application must satisfy, such as the number of concurrent users, uptime requirements, etc. Based on those requirements, we can decide whether the prototype hardware is sufficient. In the above example, assuming that the client's technical team agrees that processor utilization may go up to 60% at any time, it is clear that the processor subsystem of the prototype infrastructure is sufficient as long as the number of analysts accessing the application stays below 30,000. If the client's business team expects more than 30,000 concurrent users, or the client's technical team wants the maximum processor utilization kept lower than that, we need to re-engineer the processor subsystem to increase its strength. Otherwise, we can conclude that the processor subsystem is sufficient for the time being.
The final decisions will not always be this easy; there can be many more parameters to consider. For example, the client may require that the application hold the past two years' data at any time. In such cases, the load simulation environments should account for the processor overhead caused by the additional data. In short, the simulated environments should be smart enough to reflect the forecasted environments for the specified period (two years, three years, etc.).

Ø Finalizing the SLAs
The SLA is a mutual agreement between the client and the solution provider that the application will satisfy the predefined requirements. The client technical team will insist on agreements such as 99.99% uptime, a round-trip response time of less than 1 second, etc. The final decision on the infrastructure should be taken such that the SLAs can be met. Though the SLAs come as part of the technical requirement specifications, it is possible that business and budget reasons will force adjustments to them once the prototype testing is done. Until the SLAs are met on the prototype infrastructure, the application should be fine-tuned.

Capacity Planning - Introduction

Capacity planning for a database infrastructure can be defined as the process of assessing the infrastructure requirements for a database environment before deployment, and of projecting and predicting the future requirements of the infrastructure after the environment is deployed. Capacity planning involves various tests to measure how an environment should and will behave under the various factors that influence it, and planning the infrastructure with the resources necessary to achieve the response in a predefined manner.

§ Pre-configuration & Post-configuration Capacity Planning
As mentioned above, the entire capacity planning process can be broadly categorized into the planning and changes that we do before the environment is deployed, and the planning and changes that we do after the environment is deployed and in production. We can call these "pre-configuration capacity planning" and "post-configuration capacity planning". Pre-configuration capacity planning is often referred to as "sizing" the database infrastructure; post-configuration capacity planning is an ongoing cycle of monitoring, projecting, planning, and modifying. (A pictorial representation of this cycle appeared as an image in the original post.)


In each of the upcoming sections, we will be discussing these processes in detail, and how to carry out each process for the various hardware and other requirements.

Capacity Planning - Series Start

I was asked by my department to take a session on how to do the capacity planning process for a database infrastructure. Since the audience includes data services professionals across a lot of platforms and communities, I tried to make the session a very generic one. I will be writing a series of posts to cover all that I presented in the session. I know there will be a few instances where things are explained with the help of SQL Server (as that is by far my expertise); forgive me for that :-)

Wednesday, April 16, 2008

Scripting SQL Authentication Logins

Last week, one of my development DBA friends reported an issue. We all know that one of the development DBA tasks involved in developing a database is keeping the development, QA, staging, etc. environments synchronized. Here, he had a development database to be shipped to the client's staging server. The way he does the syncing is to script all the objects and run the scripts at the destination (something the client insisted on). Anyway, he found an issue: when he scripted the SQL authentication login, the password in the generated script was a cryptic placeholder, so when he ran the script at the destination, he got a login with a different password! He had not seen the issue earlier because he had created the login manually on the dev and QA machines. We thought of creating the login manually, but that would make the client uneasy because the client wants everything as a script. We discussed it and found a solution: there are a couple of Microsoft-recommended stored procedures to create logins using a "hashed" password. This is the link to the article – http://support.microsoft.com/kb/918992
Following is the workaround we used to recreate the SQL authentication logins. (Assume that the name of the SQL authentication login we are using for the application is "AppLogin".) We ran the scripts in the article on a SQL Server 2005 machine that has a SQL authentication login named "AppLogin" created with the standard password we wanted to use for the production, QA and development instances. The script created two stored procedures – sp_hexadecimal and sp_help_revlogin. We then executed sp_help_revlogin as follows:

EXEC sp_help_revlogin @login_name = 'AppLogin'

This generated the login creation script as follows:

CREATE LOGIN [AppLogin] WITH PASSWORD = 0x010031D72F18EC95B59D6B3682E46BFD7A6CD70B34F8783ECEBB HASHED,
SID = 0x2B94B0D0C3BCCA4DA04C916FA3A4B3FD, DEFAULT_DATABASE = [master],
CHECK_POLICY = OFF, CHECK_EXPIRATION = OFF

Note the HASHED keyword. It shows that the hexadecimal value passed in is a hash; when creating the login from this script, SQL Server will use the password corresponding to the hashed value. Once we generated the script with the stored procedures, we included it in our migration package, so that a login named "AppLogin" with the standard password is created without having the password hard coded. Just note that the SID for the login is also hard coded; if you want, you can remove the "SID = 0x2B94B0D0C3BCCA4DA04C916FA3A4B3FD" clause to ensure there is no conflict with another login that has the same SID. I am not sure whether service pack level differences will generate different hash values; we can probably test this and confirm.

Saturday, April 12, 2008

Dynamic TSQL vs. CASE

One of the major reasons I have seen people use dynamic SQL is to implement dynamic sorting, i.e. the user can sort a result set by any column and in any direction (ascending or descending) of his choice. I have seen many developers use EXEC to solve this, and in turn end up with solutions that are complicated and don't get their plans cached. I will cover EXEC vs. sp_executesql in another post.

Following is an example of how you can use CASE to replace dynamic SQL. I sent the following script as an example to one of my friends; the intention was to have a CASE-based solution for a dynamic sort.

use Tempdb
go
create table MyEmp(
EmpFirstName varchar(25), EmpLastName varchar(25), EmpLocation varchar(25)
)
go

insert into MyEmp values ('TOM', 'JOHN', 'COK')
insert into MyEmp values ('MOHIT', 'MATHEW', 'TRV')
insert into MyEmp values ('JERRY', 'EAPEN', 'TRV')
insert into MyEmp values ('PETER', 'DANIEL', 'TRV')
insert into MyEmp values ('BIJU', 'KURIAN', 'BOM')
insert into MyEmp values ('JOHN', 'MATHEW', 'BOM')
insert into MyEmp values ('TOM', 'PAUL', 'BOM')
insert into MyEmp values ('TOM', 'DANIEL', 'MAA')
insert into MyEmp values ('TOM', 'EAPEN', 'MAA')
go

if exists(select name from sys.objects where type = 'P' and name = 'SortTest')
DROP PROC SortTest
go
create procedure SortTest @SortBy varchar(25), @SortDir varchar(4)
as
begin
select EmpFirstName, EmpLastname, EmpLocation from MyEmp
order by
case @SortDir
when 'ASC' then
case @SortBy
when 'EmpFirstName' then EmpFirstName
when 'EmpLastName' then EmpLastName
when 'EmpLocation' then EmpLocation
else EmpFirstName
end
end ASC,
case @SortDir
when 'DESC' then
case @SortBy
when 'EmpFirstName' then EmpFirstName
when 'EmpLastName' then EmpLastName
when 'EmpLocation' then EmpLocation
else EmpFirstName
end
end
DESC
end
go

EXEC SortTest @SortBy = 'EmpLocation', @SortDir = 'DESC'
go

It gave me the result set as below.

MOHIT MATHEW TRV
JERRY EAPEN TRV
PETER DANIEL TRV
TOM DANIEL MAA
TOM EAPEN MAA
TOM JOHN COK
BIJU KURIAN BOM
JOHN MATHEW BOM
TOM PAUL BOM

Then he came back to me saying that in his application he was getting errors as follows –
"Msg 8115, Level 16, State 2, Procedure sspGetExceptionReportListn, Line 40
Arithmetic overflow error converting expression to data type smalldatetime."
My bad, I didn't think about the data types: all branches of a single CASE expression are converted to the data type with the highest precedence among the branches, so mixing character and date columns breaks. I rewrote the example and sent it as follows.
create table EmpDetails
(EmpId int, EmpFirstName varchar(25), EmpLastName varchar(25), EmpJoiningDate datetime)
GO
INSERT INTO EmpDetails VALUES(1, 'TOM', 'JOHN', '2007-08-01 00:00:00')
INSERT INTO EmpDetails VALUES(2, 'MATHEW', 'JOHN', '2007-08-01 00:00:00')
INSERT INTO EmpDetails VALUES(3, 'SYRIAC', 'JOHN', '2007-10-01 00:00:00')
INSERT INTO EmpDetails VALUES(4, 'TOM', 'CHANDY', '2006-07-01 00:00:00')
INSERT INTO EmpDetails VALUES(5, 'IYPE', 'JOHN', '2005-05-01 00:00:00')
INSERT INTO EmpDetails VALUES(6, 'TOM', 'JOSEPH', '2006-09-01 00:00:00')
INSERT INTO EmpDetails VALUES(7, 'LOPUS', 'ALEXANDER', '2001-01-01 00:00:00')
INSERT INTO EmpDetails VALUES(8, 'JIM', 'MATHEW', '2007-08-01 00:00:00')
INSERT INTO EmpDetails VALUES(9, 'ANIL', 'MATHEW', '2007-08-01 00:00:00')
INSERT INTO EmpDetails VALUES(10, 'JOSEPH', 'THOMAS', '2007-08-01 00:00:00')
INSERT INTO EmpDetails VALUES(10, 'ANTONY', 'GOMAS', '2007-08-01 00:00:00')
INSERT INTO EmpDetails VALUES(10, 'CHRISTOPHER', 'THOMAS', '2007-08-01 00:00:00')
GO

IF EXISTS(SELECT NAME FROM SYS.OBJECTS WHERE TYPE = 'P' AND NAME = 'SortEmpDetails')
DROP PROC SortEmpDetails
GO
CREATE PROCEDURE SortEmpDetails @SortColumn varchar(25), @SortDirection varchar(4) AS
SELECT EmpId, EmpFirstName, EmpLastName, EmpJoiningDate FROM EmpDetails
ORDER BY
CASE @SortDirection
WHEN 'DESC' THEN
CASE @SortColumn
WHEN 'EmpId' THEN CAST(EmpId AS INT)
WHEN 'EmpFirstName' THEN EmpFirstName
WHEN 'EmpLastName' THEN EmpLastName
WHEN 'EmpJoiningDate' THEN CAST(EmpJoiningDate AS DATETIME)
END
END DESC,
CASE @SortDirection
WHEN 'ASC' THEN
CASE @SortColumn
WHEN 'EmpId' THEN CAST(EmpId AS INT)
WHEN 'EmpFirstName' THEN EmpFirstName
WHEN 'EmpLastName' THEN EmpLastName
WHEN 'EmpJoiningDate' THEN CAST(EmpJoiningDate AS DATETIME)
END
END ASC
GO

EXEC SortEmpDetails @SortColumn = 'EmpJoiningDate', @SortDirection = 'DESC'
I got a result set as follows –
3 SYRIAC JOHN 10/1/07 12:00 AM
1 TOM JOHN 8/1/07 12:00 AM
2 MATHEW JOHN 8/1/07 12:00 AM
8 JIM MATHEW 8/1/07 12:00 AM
9 ANIL MATHEW 8/1/07 12:00 AM
10 JOSEPH THOMAS 8/1/07 12:00 AM
10 ANTONY GOMAS 8/1/07 12:00 AM
10 CHRISTOPHER THOMAS 8/1/07 12:00 AM
6 TOM JOSEPH 9/1/06 12:00 AM
4 TOM CHANDY 7/1/06 12:00 AM
5 IYPE JOHN 5/1/05 12:00 AM
7 LOPUS ALEXANDER 1/1/01 12:00 AM

Yes, the next obvious question was what cost the cast brings with it; nothing comes for free. I ran a small test to see how costly the cast can become. The results showed that the cast can be expensive when the column being cast has a clustered index, because the cast overrides the advantage of reading the index pages directly: the engine first scans the index to get the data, then performs the cast, and finally does a sort operation to produce the results. The test used a dummy table with 55,000 records. Following were the readings –

Column - Without Cast - With Cast
Column with Clustered Index - 0.22 milliseconds - 4.18 milliseconds
Column with Non-Clustered Index - 4.17 milliseconds - 4.18 milliseconds

So, casting and sorting on a clustered index column reduces performance, while a non-clustered index column is not affected much. In any case, the total cost depends on how large the table is.
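One caveat worth noting: a single CASE expression resolves to one data type by precedence, so mixing varchar and datetime branches in the same CASE can still raise conversion errors for some parameter combinations. A safer variant (a sketch against the same EmpDetails table) uses one CASE expression per column and direction, so each sort key keeps its native type; the non-matching CASE expressions return NULL for every row and therefore have no effect on the ordering:

```sql
-- Sketch: one CASE expression per sortable column and direction, so no
-- implicit conversion between int, varchar and datetime ever takes place.
CREATE PROCEDURE SortEmpDetailsSafe @SortColumn varchar(25), @SortDirection varchar(4) AS
SELECT EmpId, EmpFirstName, EmpLastName, EmpJoiningDate FROM EmpDetails
ORDER BY
CASE WHEN @SortColumn = 'EmpId'          AND @SortDirection = 'DESC' THEN EmpId          END DESC,
CASE WHEN @SortColumn = 'EmpId'          AND @SortDirection = 'ASC'  THEN EmpId          END ASC,
CASE WHEN @SortColumn = 'EmpFirstName'   AND @SortDirection = 'DESC' THEN EmpFirstName   END DESC,
CASE WHEN @SortColumn = 'EmpFirstName'   AND @SortDirection = 'ASC'  THEN EmpFirstName   END ASC,
CASE WHEN @SortColumn = 'EmpLastName'    AND @SortDirection = 'DESC' THEN EmpLastName    END DESC,
CASE WHEN @SortColumn = 'EmpLastName'    AND @SortDirection = 'ASC'  THEN EmpLastName    END ASC,
CASE WHEN @SortColumn = 'EmpJoiningDate' AND @SortDirection = 'DESC' THEN EmpJoiningDate END DESC,
CASE WHEN @SortColumn = 'EmpJoiningDate' AND @SortDirection = 'ASC'  THEN EmpJoiningDate END ASC
GO
```

With this shape no CAST is needed at all, which also sidesteps the clustered index cost measured above.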

Friday, April 11, 2008

SQL Server 2005 Object Level Permissions

I had been working a few weeks back with one of my friends to extract the object level permissions granted to a bunch of developers. The DBA team wanted an initial list before restructuring the existing security. I wrote a script to help with this; please find it below. It is quite a raw one; anyone can suggest modifications or modify it themselves and reuse it.


/*
SCRIPT TO FIND THE OBJECT LEVEL PERMISSIONS PROVIDED TO SPECIFIC LOGINS.
** THIS DOES NOT INCLUDE THE "ROLE BASED" PERMISSIONS SUCH AS PROVIDING THE "db_datareader" ROLE FOR A SPECIFIC LOGIN
** THE GRANTEDBY VALUE WILL ALWAYS RETAIN "dbo" IF THE PERMISSIONS WERE GRANTED BY SOMEONE WHO HAS A "sysadmin" ROLE
** TO BE USED WITH SQL SERVER 2005
Written By: RAJAN JOHN
*/
SET QUOTED_IDENTIFIER ON
GO
SET NOCOUNT ON
GO
SELECT dpm.class_desc as [AccessLevel],
ISNULL(OBJECT_NAME(dpm.major_id), 'N/A')
as [ObjectName], dpr1.name AS [GrantedTo], dpr2.name AS [GrantedBy],
CASE dpm.type
WHEN 'AL' THEN 'ALTER'
WHEN 'ALAK' THEN 'ALTER ANY ASYMMETRIC KEY'
WHEN 'ALAR' THEN 'ALTER ANY APPLICATION ROLE'
WHEN 'ALAS' THEN 'ALTER ANY ASSEMBLY'
WHEN 'ALCF' THEN 'ALTER ANY CERTIFICATE'
WHEN 'ALDS' THEN 'ALTER ANY DATASPACE'
WHEN 'ALED' THEN 'ALTER ANY DATABASE EVENT NOTIFICATION'
WHEN 'ALFT' THEN 'ALTER ANY FULLTEXT CATALOG'
WHEN 'ALMT' THEN 'ALTER ANY MESSAGE TYPE'
WHEN 'ALRL' THEN 'ALTER ANY ROLE'
WHEN 'ALRT' THEN 'ALTER ANY ROUTE'
WHEN 'ALSB' THEN 'ALTER ANY REMOTE SERVICE BINDING'
WHEN 'ALSC' THEN 'ALTER ANY CONTRACT'
WHEN 'ALSK' THEN 'ALTER ANY SYMMETRIC KEY'
WHEN 'ALSM' THEN 'ALTER ANY SCHEMA'
WHEN 'ALSV' THEN 'ALTER ANY SERVICE'
WHEN 'ALTG' THEN 'ALTER ANY DATABASE DDL TRIGGER'
WHEN 'ALUS' THEN 'ALTER ANY USER'
WHEN 'AUTH' THEN 'AUTHENTICATE'
WHEN 'BADB' THEN 'BACKUP DATABASE'
WHEN 'BALO' THEN 'BACKUP LOG'
WHEN 'CL' THEN 'CONTROL'
WHEN 'CO' THEN 'CONNECT'
WHEN 'CORP' THEN 'CONNECT REPLICATION'
WHEN 'CP' THEN 'CHECKPOINT'
WHEN 'CRAG' THEN 'CREATE AGGREGATE'
WHEN 'CRAK' THEN 'CREATE ASYMMETRIC KEY'
WHEN 'CRAS' THEN 'CREATE ASSEMBLY'
WHEN 'CRCF' THEN 'CREATE CERTIFICATE'
WHEN 'CRDB' THEN 'CREATE DATABASE'
WHEN 'CRDF' THEN 'CREATE DEFAULT'
WHEN 'CRED' THEN 'CREATE DATABASE DDL EVENT NOTIFICATION'
WHEN 'CRFN' THEN 'CREATE FUNCTION'
WHEN 'CRFT' THEN 'CREATE FULLTEXT CATALOG'
WHEN 'CRMT' THEN 'CREATE MESSAGE TYPE'
WHEN 'CRPR' THEN 'CREATE PROCEDURE'
WHEN 'CRQU' THEN 'CREATE QUEUE'
WHEN 'CRRL' THEN 'CREATE ROLE'
WHEN 'CRRT' THEN 'CREATE ROUTE'
WHEN 'CRRU' THEN 'CREATE RULE'
WHEN 'CRSB' THEN 'CREATE REMOTE SERVICE BINDING'
WHEN 'CRSC' THEN 'CREATE CONTRACT'
WHEN 'CRSK' THEN 'CREATE SYMMETRIC KEY'
WHEN 'CRSM' THEN 'CREATE SCHEMA'
WHEN 'CRSN' THEN 'CREATE SYNONYM'
WHEN 'CRSV' THEN 'CREATE SERVICE'
WHEN 'CRTB' THEN 'CREATE TABLE'
WHEN 'CRTY' THEN 'CREATE TYPE'
WHEN 'CRVW' THEN 'CREATE VIEW'
WHEN 'CRXS' THEN 'CREATE XML SCHEMA COLLECTION'
WHEN 'DL' THEN 'DELETE'
WHEN 'EX' THEN 'EXECUTE'
WHEN 'IM' THEN 'IMPERSONATE'
WHEN 'IN' THEN 'INSERT'
WHEN 'RC' THEN 'RECEIVE'
WHEN 'RF' THEN 'REFERENCES'
WHEN 'SL' THEN 'SELECT'
WHEN 'SN' THEN 'SEND'
WHEN 'SPLN' THEN 'SHOWPLAN'
WHEN 'SUQN' THEN 'SUBSCRIBE QUERY NOTIFICATIONS'
WHEN 'TO' THEN 'TAKE OWNERSHIP'
WHEN 'UP' THEN 'UPDATE'
WHEN 'VW' THEN 'VIEW DEFINITION'
WHEN 'VWDS' THEN 'VIEW DATABASE STATE'
END AS [Permission]
FROM sys.database_permissions dpm INNER JOIN sys.database_principals dpr1
ON dpm.grantee_principal_id = dpr1.principal_id INNER JOIN sys.database_principals dpr2
ON dpm.grantor_principal_id = dpr2.principal_id
WHERE dpr1.principal_id NOT IN (0,1,2,3,4,16384,16385,16386,16387,16388,16389,16390,16391,16392,16393)
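To see what the script returns, you can grant a couple of object-level permissions to a test user and run it. A sketch; the user and table names below are hypothetical:

```sql
-- Hypothetical setup: a loginless test user and two object-level grants
CREATE USER DevUser1 WITHOUT LOGIN
GO
CREATE TABLE TestOrders (OrderId int)
GO
GRANT SELECT, UPDATE ON TestOrders TO DevUser1
GO
-- Running the script above should now list two rows for DevUser1,
-- one for the SELECT grant and one for the UPDATE grant on TestOrders.
```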

Thursday, April 10, 2008

Restoring Reporting Services 2005 Databases

We had applications that use Reporting Services 2005 as the core report engine. Though the backup/restore of the Reporting Services databases wasn't an issue during the initial application development phase, once we started deploying the application for various clients it became obvious that we needed a formal procedure for moving the databases between environments without breaking anything. I wrote a procedure to facilitate this, and I am sharing it here.

Reporting Services 2005 uses two databases for storing its information and as workspace to process the reports. For a default installation they will be named “ReportServer” and “ReportServerTempDB”. It becomes necessary at times to copy the databases to recreate the Report Engine of an application on another instance of Reporting Services, mostly for testing purposes and analysis purposes. The following procedure explains the steps to be followed while copying the Reporting Services databases from one instance of Reporting services to another.

IMPORTANT: Ensure that both the source and Destination Reporting Services instances are of the same version and editions before proceeding with the changes.
Execute the following steps in order to move the Reporting Services databases from one instance to another.

§ Backup the Databases – Source Instance
1. Open “SQL Server Management Studio” from “Start” → “Programs” → “Microsoft SQL Server 2005” → “SQL Server Management Studio” and connect to the Database Engine with the appropriate credentials.
2. Expand the “Databases” node and select the “ReportServer” database. Right-click and select “Tasks” → “Backup”. This will open the “Backup Database” tool.
3. In the “General” page do the following:
· In the “Source” section, select the Database as “ReportServer” and Backup type as “Full”.
· In the “Backup Set” section, ensure that “Backup set will expire” is set to “0 Days”.
· In the “Destination” section, select the radio button for “Disk” and click the “Add” button. Browse the path to “E:\SigTools” (or any other standard backups location) and set the file name to “ReportServer_<date>”, where <date> is the current date.
4. Click the “OK” button to create the backup set.
5. Repeat steps 2 to 4 to back up “ReportServerTempDB” and save the backup set as “ReportServerTempDB_<date>”, where <date> is the current date.
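The same backups can also be scripted in T-SQL instead of using the GUI. A sketch; the path is the standard backups location assumed above, and the date suffix in the file names is an example:

```sql
-- Back up both Reporting Services databases to the standard backups location
BACKUP DATABASE ReportServer
    TO DISK = 'E:\SigTools\ReportServer_20080410.bak'
    WITH INIT
GO
BACKUP DATABASE ReportServerTempDB
    TO DISK = 'E:\SigTools\ReportServerTempDB_20080410.bak'
    WITH INIT
GO
```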
§ Backup the Encryption Key – Source Instance
Follow the procedure below to back up the encryption key that connects the Report Server with the Reporting Services databases.
1. Open the “Reporting Services Configuration” tool from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “Reporting Services Configuration” and connect to the appropriate Reporting Services instance.
2. Select the “Encryption Keys” link to open the “Encryption Key” page and do the following.
· Click the “Backup” button to open the “Encryption Key Information” box.
· In the “Password” box, enter the standard complex password, and in the “Key File” box, browse to “E:\SigTools” (or any other standard backups location) and set the file name to “EncKey_<date>”, where <date> is the current date.
· Click the “OK” button to back up the key. Ensure that the “Task Status” pane shows no error messages.

§ Copying the Backups to the Destination Instance
Physically copy the backup sets to the destination server so as to make them available to the destination instance. If the backups were taken to a network share that both the source and destination instances have access to, this step can be skipped.

§ Backup the Databases – Destination Instance
1. Open “SQL Server Management Studio” from “Start” → “Programs” → “Microsoft SQL Server 2005” → “SQL Server Management Studio” and connect to the Database Engine with the appropriate credentials.
2. Expand the “Databases” node and select the “ReportServer” database. Right-click and select “Tasks” → “Backup”. This will open the “Backup Database” tool. In the “General” page do the following:
· In the “Source” section, select the Database as “ReportServer” and Backup type as “Full”.
· In the “Backup Set” section, ensure that “Backup set will expire” is set to “0 Days”.
· In the “Destination” section, select the radio button for “Disk” and click the “Add” button. Browse the path to “E:\SigTools” (or any other standard backups location) and set the file name to “ReportServer_<date>”, where <date> is the current date.
· Click the “OK” button to create the backup set.
3. Repeat step 2 to back up “ReportServerTempDB” and save the backup set as “ReportServerTempDB_<date>”, where <date> is the current date.

§ Backup the Encryption Key – Destination Instance
Follow the procedure below to back up the encryption key that connects the Report Server with the Reporting Services databases.
1. Open the “Reporting Services Configuration” tool from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “Reporting Services Configuration” and connect to the appropriate Reporting Services instance.
2. Select the “Encryption Keys” link to open the “Encryption Key” page and do the following.
· Click the “Backup” button to open the “Encryption Key Information” box.
· In the “Password” box, enter the standard complex password, and in the “Key File” box, browse to “E:\SigTools” (or any other standard backups location) and set the file name to “EncKey_<date>”, where <date> is the current date.
· Click the “OK” button to back up the key. Ensure that the “Task Status” pane shows no errors.

§ Stopping IIS and Report Server Services
1. Open the “Services” mmc console from “Start” → “Programs” → “Administrative Tools” → “Services”.
2. Select the “IISAdmin” service and stop it. A dialog box will ask to stop all IIS dependent services such as “WWW Publishing”, “SMTP”, “FTP Publishing” etc. Click “Yes” to continue. This stops IIS and its dependent services.
3. Open the “Reporting Services Configuration” tool from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “Reporting Services Configuration”. On the “Server Status” page, click the “Stop” button to stop the Report Server.

§ Restoring the Databases – Destination Instance
1. Open “SQL Server Management Studio” from “Start” → “Programs” → “Microsoft SQL Server 2005” → “SQL Server Management Studio” and connect to the Database Engine with the appropriate credentials.
2. Expand the “Databases” node and select the “ReportServer” database. Right-click and select “Tasks” → “Restore” → “Database”. This will open the “Restore Database” tool.
3. On the “General” page, do the following:
· In the “Destination for restore” section, select the Database as “ReportServer”.
· In the “Source for restore” section, select “From Device” and click the browse button. Select “File” as the “Backup Media Type” and click the “Add” button. Select the backup set for the ReportServer database and click the “OK” button.
· In the “Backup Sets to Restore” section, select the appropriate backup set.
4. On the “Options” page, do the following:
· In the “Restore Options” section, ensure that the correct file paths are provided for the data files and log files.
5. Click the “OK” button to start the database restore process.
6. Ensure that the restore completed successfully.
7. Perform steps 2 to 6 to restore the ReportServerTempDB database.
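The restores can also be scripted in T-SQL. A sketch; the backup file names match the ones assumed earlier, and the logical and physical file names below are assumptions, so check them with RESTORE FILELISTONLY against the backup set first:

```sql
-- Restore both databases over the destination copies, moving the files
-- to the destination data paths (assumed paths; verify before running)
RESTORE DATABASE ReportServer
    FROM DISK = 'E:\SigTools\ReportServer_20080410.bak'
    WITH REPLACE,
         MOVE 'ReportServer' TO 'E:\Data\ReportServer.mdf',
         MOVE 'ReportServer_log' TO 'E:\Data\ReportServer_log.ldf'
GO
RESTORE DATABASE ReportServerTempDB
    FROM DISK = 'E:\SigTools\ReportServerTempDB_20080410.bak'
    WITH REPLACE,
         MOVE 'ReportServerTempDB' TO 'E:\Data\ReportServerTempDB.mdf',
         MOVE 'ReportServerTempDB_log' TO 'E:\Data\ReportServerTempDB_log.ldf'
GO
```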

§ Starting IIS and Report Server Services
1. Open the “Services” mmc console from “Start” → “Programs” → “Administrative Tools” → “Services”.
2. Select the “IISAdmin” service and start it. Start all the IIS dependent services such as “WWW Publishing”, “SMTP”, “FTP Publishing” etc. which were stopped along with the “IIS Admin” service.
3. Open the “Reporting Services Configuration” tool from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “Reporting Services Configuration”. On the “Server Status” page, click the “Start” button to start the Report Server.

§ Verifying the Reporting Services Configurations
Follow the procedure below to verify that the Reporting Services configurations on the source and destination instances match.
1. Open the “Reporting Services Configuration” tool for the source and destination instances from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “Reporting Services Configuration” and connect to the appropriate Reporting Services instances.
2. Verify that the configurations match exactly on both instances. This includes the following.
· Report Server virtual directory
· Report Manager virtual directory
· Windows Service Identity
· Web Service Identity
3. Click the “Apply” button to save any changes.

§ Restoring the Encryption Key – Destination Instance
Follow the procedure below to restore the encryption key backed up from the source server.
1. Open the “Reporting Services Configuration” tool for the destination instance from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “Reporting Services Configuration” and connect to the appropriate Reporting Services instance.
2. Open the “Encryption Keys” page and click the “Delete” button to delete the existing encryption key on the destination.
NOTE: Ensure that you have backed up the encryption key before you perform any deletions.
3. Click the “Restore” button to open the “Encryption Key Information” box. Provide the password and click the browse button to locate the key file. Click the “OK” button to perform the restoration.
4. Ensure that there are no error messages in the “Task Status” pane.
5. Open the “SQL Server Configuration Manager” from “Start” → “Programs” → “Microsoft SQL Server 2005” → “Configuration Tools” → “SQL Server Configuration Manager” and restart the “SQL Server Reporting Services” service.

IMPORTANT: It is recommended that the destination server is rebooted after the changes.

Using Database Mail in SQL Server 2005

SQL Server 2005 Database Mail is an easy-to-use feature that can be called by any application or code to send notification emails. DBAs can use it to monitor jobs and batches across hundreds of database servers and instances that run a large number of periodic scheduled jobs.
Database Mail sends emails from the SQL Server engine to a set of recipients. We can use the sp_send_dbmail stored procedure in the msdb database to send SMTP notification emails.

Using the sp_send_dbmail Stored Procedure
The stored procedure msdb.dbo.sp_send_dbmail can be called from the application whenever a notification email has to be sent. The only prerequisite is that the SQL Server instance through which the database mail is being sent should have an email profile created. This profile contains the required information such as the sender email address, reply address, SMTP server name, SMTP port number etc. The section – Configuring the Email Profile – explains how to configure an email profile.
Following is the syntax for the sp_send_dbmail –
Execute msdb.dbo.sp_send_dbmail
@profile_name = '<profile name>',
@recipients = '<recipient list>',
@subject = '<subject>',
@body = '<message body>',
@importance = '<importance>',
@body_format = '<body format>'

@profile_name – This should be the profile name to be used
@recipients – This should be the list of email addresses delimited by semicolons.
@subject – This should be the subject of the email.
@body – This should be the email body
@importance – This should be either ‘low’, ‘normal’ or ‘high’
@body_format – This should be either ‘TEXT’ or ‘HTML’

Example –
Execute msdb.dbo.sp_send_dbmail
@profile_name = 'Database Mail Profile',
@recipients = 'abc@myComp.com; xyz@myComp.com',
@subject = 'Backup Completed',
@body = 'AppData database backed up successfully!',
@importance = 'high',
@body_format = 'TEXT'
You can also assign values to variables dynamically and pass the variables as the parameter values.
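For instance, the notification text can be built up in variables first and the variables passed in. A sketch; the profile name and recipient address reuse the example values above:

```sql
-- Build the notification text dynamically, then pass the variables
DECLARE @subj varchar(100), @msg varchar(500)
SET @subj = 'Backup Completed - ' + CONVERT(varchar(20), GETDATE(), 120)
SET @msg  = 'AppData database backed up successfully at '
            + CONVERT(varchar(20), GETDATE(), 120) + '.'
EXECUTE msdb.dbo.sp_send_dbmail
    @profile_name = 'Database Mail Profile',
    @recipients = 'abc@myComp.com',
    @subject = @subj,
    @body = @msg,
    @importance = 'high',
    @body_format = 'TEXT'
```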
Configuring the Email Profile
Follow the procedure below to configure SQL Server 2005 Database Mail.
§ Creating the Database Mail Profile
· Connect the SQL Server instance for which Database Mail has to be configured using SQL Server Management Studio.
· Open the “Database Mail Configuration Wizard” from “Management” → “Database Mail” → “Configure Database Mail”.
· If you are using Database Mail for the first time, select “Setup Database Mail” and click the “Next” button to proceed.
· In the “New Profile” window, do the following.
o In the “Profile Name” textbox, give the appropriate profile name. E.g. “TCO Database Profile”
o In the description textbox, give purpose of the profile you are creating.
o Under the “SMTP Accounts” section, click the “Add” button to add a new account. This will open up the window “Add a new Account to TCO Database Profile”. Click the “New Account” button.
1. In the “Account Name” textbox, enter the appropriate account name. E.g. “TCO Notification”
2. Give the purpose of the account in the “Description” textbox.
3. In the “Email Address” textbox, choose an appropriate email address. The database emails will show this as the sender address. This address need NOT be a valid email address. E.g. TcoNotification@myComp.com.
4. In the “Display Name” section, enter any name that should be displayed as the sender name. E.g. “TCO Notification”
5. In the “Reply Email” section, enter a valid email address if a reply address is required.
6. In the “Server Name” section, enter the accurate name of the SMTP server through which the emails can be sent. E.g. “EmailServer-1.myComp.com”
7. In the “SMTP Authentication” section, select the appropriate authentication. By default, choose “Anonymous Authentication”.
8. Click the “OK” button to save the changes. This will bring you back to “Database Mail Configuration” wizard.
NOTE: Contact your Network Administrator for the details on SMTP server details and the authentication information.
o Once the account has been created and configured, click the “Next” button to proceed.
o In the “Profile Security” section, select whether you want the profile to be a “Private Profile” or a “Public Profile”. A private profile can be used by only a single user, while a public profile can be used by any user. Select the appropriate security, and click the “Next” button to proceed.
o In the “Configure System Parameters” section, ensure that the parameter values are appropriate, else make any required change. Click the “Next” button to proceed.
o In the “Complete the Wizard” section, go through the various changes you have made. Click the “Finish” button to complete the configuration.
o Click the “Close” button to exit the wizard.
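The same profile and account can also be created in T-SQL with the msdb configuration procedures instead of the wizard. A sketch using the example names from the steps above; the SMTP server name is an assumption:

```sql
-- Create the account, the profile, and link them (sequence 1 = primary account)
EXECUTE msdb.dbo.sysmail_add_account_sp
    @account_name = 'TCO Notification',
    @email_address = 'TcoNotification@myComp.com',
    @display_name = 'TCO Notification',
    @mailserver_name = 'EmailServer-1.myComp.com'
GO
EXECUTE msdb.dbo.sysmail_add_profile_sp
    @profile_name = 'TCO Database Profile'
GO
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
    @profile_name = 'TCO Database Profile',
    @account_name = 'TCO Notification',
    @sequence_number = 1
GO
```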

· You should test the profile you created to ensure the correctness. Follow the steps below to send a test mail.
o Open the “Send Test Mail” window from “Management” → “Database Mail” → “Send Test Mail”.
o Select the profile created and enter your valid email address.
o Click the “Send Test Mail” button to send the test mail.
o Click the “OK” button to exit the window.
o Verify that you received the test mail. If you did not receive the test mail, review the configuration and try again.