An ETL that i build recently instigated me to share the following excerpt of python code. The players in this ETL are: Apache Kafka (Source) Azure Data Factory (ETL app) Azure Databricks (Extract and Transform with Python) Azure Data Lake Storage (File storage) Cosmos DB (Destination) In this example i […]
ETL
In this example we create ADF pipeline that extracts from SQL Server and saves in CSV files in Data Lake. The point is just to demonstrate the logic so you can edit it as you need. The extract is driven by SQL Server table with all the parameters for a […]
Create table and insert data: Databricks commands: The values for hostname, port, database and username can be found in the connection string of the database. Create a widget to pass parameters from Azure Data Factory. As many as needed parameters can be declared and used in a dynamic SQL query. […]
SSISDB is the database that is collaborating with SSIS. When we execute a package, the execution information is logged in SSISDB. If a job is scheduled, we need to know if there are errors in the executions. As a last step in SSIS package, I run stored procedure that searches […]
In the last posts we played with Change Tracking (CT): T-SQL: Change Tracking – What is this and how it works T-SQL: Change Tracking Retention Period – How Long the Changes are Kept? Let’s execute few queries and see what do we have after the retention period expires.
As we mentioned in Change Tracking – What is this and how it works, the retention period is the one that determines how long the changes are kept.
Change tracking was introduced in SQL Server 2008. This is a tool that keeps the fact that there are changes in tracked tables for a specified period of time. It doesn’t show what the changes are, it tells that a data has been changed. It answers if a change has been […]
While working on ETL, I build a list of tables and their corresponding columns that need to be created on the destination server. In this case the square brackets around SchemaName and TableName are mandatory. I started adding the square brackets with the built-in function QUOTENAME(), but soon I got […]
Very often while exchanging data files with clients, a useful information is stored in the file name. This can be a client name prefix, date, time, the period that filtered the data in the file and so on. While loading the data into the database, we need to extract the […]
Data Warehouse is “a system used for reporting and data analysis”. In general it is built by dim (dimension) and fact tables (structure and data tables). Here I create a stored procedure to build Date Dimension. Now it builds only calendar data and will be expanded to include “fiscal” data […]