I am trying to use the Get Metadata activity to list all files, filtered by a last_modified_timestamp parameter, and then use that output to copy the data to ADLS.
Below is the directory structure on the SFTP server. There are files in rootpath itself and in a number of subdirectories. I want to capture all the CSV files, zip files, and anything else of type File.
rootpath/
    test.csv
    test1.csv
    dir1/
        test2.csv
        test3.zip
    dir2/
        dir3/
            test4.csv
            test5.csv
            test6.zip
I've tried a wildcard path in the dataset I defined, to capture every possible file starting from rootpath:
rootpath/**/**/**
However, this did not return the files that sit directly in rootpath, that is, test.csv and test1.csv.
Is there a dynamic and efficient way to do this? I saw some posts about using a ForEach activity, but that seems like overkill.
A sample of my desired output is below.
"childItems": [
{
"name": "test.csv",
"type": "File"
},
{
"name": "test1.csv",
"type": "File"
},
{
"name": "test2.csv",
"type": "File"
},
{
"name": "test3.zip",
"type": "File"
},
...
{
"name": "test6.zip",
"type": "File"
},
In Azure Data Factory, Get Metadata cannot recurse, so it will not return all the files at every level of a nested folder subtree in one call. Its output includes only the items directly inside the specified path. To reach nested files, you need to iterate over the folders that Get Metadata returns, for example with a ForEach loop, but that approach only takes you one level deep.
Richard Swinbank's article "Get Metadata recursively in Azure Data Factory" discusses a workaround for this same situation using an Until loop and some variables.
Once you have the full list of file paths, you can run a further Get Metadata activity on each of them to get its last-modified timestamp.
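The logic behind the Until-loop workaround can be sketched outside ADF as a queue of folders still to be listed. This is only an illustration in Python, not ADF pipeline code: `TREE`, `LAST_MODIFIED`, `get_child_items`, `list_files`, and `modified_since` are hypothetical names, the timestamps are made up, and the in-memory dictionary stands in for the SFTP server from the question.

```python
from collections import deque
from datetime import datetime

# Hypothetical stand-in for the SFTP server: folder path -> its direct children,
# shaped like the childItems array that Get Metadata returns.
TREE = {
    "rootpath": [
        {"name": "test.csv", "type": "File"},
        {"name": "test1.csv", "type": "File"},
        {"name": "dir1", "type": "Folder"},
        {"name": "dir2", "type": "Folder"},
    ],
    "rootpath/dir1": [
        {"name": "test2.csv", "type": "File"},
        {"name": "test3.zip", "type": "File"},
    ],
    "rootpath/dir2": [{"name": "dir3", "type": "Folder"}],
    "rootpath/dir2/dir3": [
        {"name": "test4.csv", "type": "File"},
        {"name": "test5.csv", "type": "File"},
        {"name": "test6.zip", "type": "File"},
    ],
}

# Made-up last-modified timestamps, for illustration only.
LAST_MODIFIED = {
    "rootpath/test.csv": datetime(2023, 1, 1),
    "rootpath/test1.csv": datetime(2023, 3, 1),
    "rootpath/dir1/test2.csv": datetime(2023, 1, 15),
    "rootpath/dir1/test3.zip": datetime(2023, 4, 1),
    "rootpath/dir2/dir3/test4.csv": datetime(2023, 2, 1),
    "rootpath/dir2/dir3/test5.csv": datetime(2023, 5, 1),
    "rootpath/dir2/dir3/test6.zip": datetime(2023, 6, 1),
}


def get_child_items(folder):
    """Plays the role of one Get Metadata call: direct children only."""
    return TREE.get(folder, [])


def list_files(root):
    """Collect every file under root by repeatedly listing queued folders."""
    queue = deque([root])   # folders still to be listed (the "Until" condition)
    files = []
    while queue:
        folder = queue.popleft()
        for item in get_child_items(folder):
            path = f"{folder}/{item['name']}"
            if item["type"] == "Folder":
                queue.append(path)   # list this folder on a later iteration
            else:
                files.append(path)
    return files


def modified_since(paths, cutoff):
    """Keep only the files whose last-modified time is at or after cutoff."""
    return [p for p in paths if LAST_MODIFIED[p] >= cutoff]


all_files = list_files("rootpath")
recent = modified_since(all_files, datetime(2023, 3, 1))
```

In a pipeline, the queue would correspond to an array variable and the `while` loop to the Until activity; the sketch keeps full paths rather than bare names so that files in different folders stay distinguishable when they are copied.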