Monday, September 8, 2014

2. Sampling is to select a random subset of given set of data. Suppose we have 1000 rows in our table. And we have to select any 10 rows randomly, then we use sampling transformation.
                In SSIS, we can do sampling in 2 ways i.e.  Row Sampling and Percentage Sampling.
In this BOK, we are going to see how to use Percentage Sampling transformation in SSIS Packaging. Percentage sampling is used to randomly select some rows and move it as output to the required process as and when required to divide the rows. Example of using this transformation is say example if we want to select some randomly 20% users of a community for a random prize then we can use this transformation. In this process we are going to see an example on how to use this process for the selected output values.
 
Steps:
First select the right project to work on integration services project. Now once the projects is opened drag and drop a source, a destination and a percentage sampling task as shown in the screen below

We can see some red marks on each task which indicates that the tasks are not configured. We need to configure each task so that while execution we can have a smooth process. Now let’s configure each and every task to execute the package. First let us start with the OLEDB Source as shown in the screen below –
 
 
 
Now we are done with the source, we need to configure the percentage sampling task now. To do that double click on the task will open the window as shown in the screen below. Here we need to specify the percentage of rows to be affected in this transformation and to proceed further. In our sample we are going to select as 40 as shown in the screen below – 
 
 
 


 
 
 
 
 
 
 
 
 
 
 
Now we are done with the Percentage sampling task, we need to configure the destination section where the results are expected. To do that drag and drop the green arrow to the destination task which we created earlier. It will open a configuration window to select the output name from the percentage sampling task as shown in the screen below
 
Now we need to configure the destination excel as shown in the screens below which is self-explanatory.
 
 
Now we are ready with our package. We need to build and execute it to see the desired result. So our screen will look like below
 

 
Now to build and execute press F5 and we can see the result window as shown in the screen below
 

 
Here you can see that out of 29 rows, 40% rows that are approximate to 12 rows randomly have been sent to excel destination.
 
We can see the number of rows affected and used across. To see the result in the excel navigate to the path where we configured our destination and open the excel, we can see the result as shown in the screen below
 
 
Conclusion: So in this article we have seen how to use the Percentage Sampling to execute data set and split based on the percent and uses it across the requirement

No comments:

Post a Comment