Very often while exchanging data files with clients, a useful information is stored in the file name. This can be a client name prefix, date, time, the period that filtered the data in the file and so on.
While loading the data into the database, we need to extract the portions of the filename, store them into variables and use these variables during the data load.
Let’s say the client uploads their files in a folder D:\Temp.
The file name is delimited by underscore (_) and is predefined as:
- VariableA
- <delimiter>
- VariableB
- <delimiter>
- VariableC
The client uploads the files:
- VariableA_VariableB_VariableC.csv
- VariableAA_VariableBB_VariableCC.csv
- VariableAAA_VariableBBB_VariableCCC.csv
- VariableAAAA_VariableBBBB_VariableCCCC.csv
- VariableAAAAAA_VariableBBBBB_VariableCCCCC.csv
Create SSIS project (I use Visual Studio 2008 in this example).
First rename the package.
In Control Flow add Foreach Loop Container. As this example is relatively simple I will use abbreviations for the objects. Foreach Loop Container will be FLC.
Select FLC and add 2 variables in its scope:
- FolderPath (String) – to store the path of the client’s folder and change it only once when needed
- FileName (String) – to store the filename while looping the FLC.
Double click FLC and:
- Add FolderPath variable to Directory expression
- Point to .csv files only
- Read the file name only
Map the FileName variable:
We can insert the values of the variables, stored in the filename into SQL log table.
Create database.
USE [master]; GO IF (DB_ID('PL') IS NOT NULL) BEGIN ALTER DATABASE PL SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE PL; END GO CREATE DATABASE PL; GO USE PL; GO
Create the log table.
IF (OBJECT_ID('PL.dbo.LogTable', 'U') IS NOT NULL) BEGIN DROP TABLE PL.dbo.LogTable; END CREATE TABLE PL.dbo.LogTable ( ID INT IDENTITY(1, 1) PRIMARY KEY , FolderPath VARCHAR(128) , [FileName] VARCHAR(128) , VariableA VARCHAR(128) , VariableB VARCHAR(128) , VariableC VARCHAR(128) , LoadDateTime DATETIME DEFAULT(GETDATE()) ); GO
Add the variables that will store the extracted values from the file name.
Click the FLC to add the variables in its scope.
- VariableA (String)
- VariableB (String)
- VariableC (String)
Add Execute SQL Task (EST) inside the FLC:
Double click EST and:
- Set the SQL connection
- Write the SQL statement to insert the data
INSERT PL.dbo.LogTable ( [FileName] , FolderPath , VariableA , VariableB , VariableC ) SELECT ?, ?, ?, ?, ?;

4. Map the variables, collected from the file name.
We need to assign the values from the different parts of the file name into variables and manipulate them in the loop.
Let’s look at the first filename –> VariableA_VariableB_VariableC
We need to cut the string from position 1 to 9 and assign its value into VariableA variable.
We add an expression.
and we use SUBSTRING() and FINDSTRING() built-in functions.
Putting the arguments into
SUBSTRING(String, StartPosition, CharactersCount)
ends to
SUBSTRING("VariableA_VariableB_VariableC", 1, 9)
and the result is VariableA
, but… we need a flexible CharacterCount, based on the position of the first underscore (_).
To find it, we use
FINDSTRING(String, FindWhat, WhichOccurence)
and we can invoke it in our example like
FINDSTRING("VariableA_VariableB_VariableC", "_", 1)
or “what is the position of the first occurrence of underscore (_)?”.
The result is 10.
As we need 9 as CharactersCount, we add “-1” to FINDSTRING():
FINDSTRING("VariableA_VariableB_VariableC", "_", 1) - 1
Putting both functions together
SUBSTRING("VariableA_VariableB_VariableC ", 1, (FINDSTRING("VariableA_VariableB_VariableC", "_", 1) - 1))
ends to VariableA.
Test the 5th file VariableAAAAAA_VariableBBBBB_VariableCCCCC
SUBSTRING("VariableAAAAAA_VariableBBBBB_VariableCCCCC", 1, (FINDSTRING("VariableAAAAAA_VariableBBBBB_VariableCCCCC", "_", 1) - 1))
ends to VariableAAAAAA.
Replace the hard-coded string with the first variable and test
SUBSTRING(@[User::FileName], 1, (FINDSTRING(@[User::FileName], "_", 1) - 1))
Ups! An error occurs
Here is the tricky part – add ABS() to convert the negative value to positive
SUBSTRING(@[User::FileName], 1, ABS((FINDSTRING(@[User::FileName], "_", 1) - 1)))
Map the variables in EST.
Before the test, we need to create the test files in D:\Temp
Run the package by clicking F5 in SSIS.
Check the Log table.
SELECT * FROM PL.dbo.LogTable; GO TRUNCATE TABLE PL.dbo.LogTable; GO

Expression for VariableB:
SUBSTRING( @[User::FileName] , (FINDSTRING(@[User::FileName], "_", 1) + 1) , ABS(FINDSTRING(@[User::FileName], "_", 2) - (FINDSTRING(@[User::FileName], "_", 1) + 1)) )
Expression for VariableC:
SUBSTRING( @[User::FileName] , (FINDSTRING(@[User::FileName], "_", 2) + 1) , ABS(FINDSTRING(@[User::FileName], "_", 3) - (FINDSTRING(@[User::FileName], "_", 2) + 1)) )

Delete database.
USE [master]; GO IF (DB_ID('PL') IS NOT NULL) BEGIN ALTER DATABASE PL SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE PL; END GO
Keep it simple :-)