I have a table in cdk that is created and populated from s3, something like:
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as ddb from 'aws-cdk-lib/aws-dynamodb';
import { ArnPrincipal, PolicyStatement } from 'aws-cdk-lib/aws-iam';

const bucket = new s3.Bucket(this, 'ImportBucket', {
  bucketName: 'ddb-data-for-import',
});

bucket.addToResourcePolicy(
  new PolicyStatement({
    actions: [
      's3:AbortMultipartUpload',
      's3:PutObject',
      's3:PutObjectAcl',
    ],
    resources: [`${bucket.bucketArn}/*`],
    principals: [new ArnPrincipal('arn:xxx')],
  }),
);

new ddb.Table(this, 'ImportedDdb', {
  tableName: 'imported',
  billingMode: ddb.BillingMode.PAY_PER_REQUEST,
  importSource: {
    bucket,
    keyPrefix: 'AWSDynamoDB/123123123-321312321/data',
    inputFormat: ddb.InputFormat.dynamoDBJson(),
    compressionType: ddb.InputCompressionType.GZIP,
  },
  partitionKey: { name: 'pk', type: ddb.AttributeType.STRING },
  sortKey: { name: 'sk', type: ddb.AttributeType.STRING },
});
It works and the data is imported, but after the initial run I see no point in keeping the bucket around with GBs of data that will be stale a minute later. So I thought I could remove the importSource property from the table and redeploy without a problem, but CloudFormation then tries to recreate the table, causing data loss.
Is there a way to clean up this resource? Should I? I don't see the point of keeping the bucket, since if anything happens and the import has to be rerun, I'd have to export a newer snapshot anyway.
First add a retain policy and deploy, then try removing the import source; the retention policy will keep the existing table intact.
removalPolicy: RemovalPolicy.RETAIN
I'm not well versed in importSource, as it wasn't me who created it, but I believe the removalPolicy should let you achieve what you want.
https://docs.aws.amazon.com/cdk/api/v1/docs/@aws-cdk_core.RemovalPolicy.html
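For reference, a minimal sketch of where that property would go on the table construct, assuming aws-cdk-lib (v2) imports and that the other props stay as in your question:

import { RemovalPolicy } from 'aws-cdk-lib';
import * as ddb from 'aws-cdk-lib/aws-dynamodb';

new ddb.Table(this, 'ImportedDdb', {
  tableName: 'imported',
  billingMode: ddb.BillingMode.PAY_PER_REQUEST,
  partitionKey: { name: 'pk', type: ddb.AttributeType.STRING },
  sortKey: { name: 'sk', type: ddb.AttributeType.STRING },
  // RETAIN tells CloudFormation to leave the existing table in the account
  // instead of deleting it if the resource is removed or replaced
  removalPolicy: RemovalPolicy.RETAIN,
});

With RETAIN applied first, CloudFormation won't delete the existing table when you later change the resource, so it's worth reviewing the change set after dropping importSource to see whether a replacement is still proposed.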