
Triggering AWS Lambda On Arrival Of New Files In AWS S3

I have a Lambda function written in Python that runs Redshift COPY commands for three tables from three files located in AWS S3. For example, I have tables A, B, and C. …

Solution 1:

When an Amazon S3 Event triggers an AWS Lambda function, it provides the Bucket name and Object key as part of the event:

import urllib.parse

def lambda_handler(event, context):

    # Get the bucket and object key from the Event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

While the object details are passed as a list, I suspect that each event only ever contains one object (hence the use of [0]). However, I'm not 100% certain that this will always be the case, so it's best to assume one object per event until proven otherwise.
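
If you would rather not rely on that assumption, a defensive version simply loops over every record in the event. This is a minimal sketch; the print call stands in for whatever per-file logic you need:

import urllib.parse

def lambda_handler(event, context):

    # Handle every record, in case an event ever carries more than one object
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        print(f"Received s3://{bucket}/{key}")  # replace with the per-file logic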

Thus, if your code is expecting specific objects, it can branch on the key and build the matching COPY command (shown here in real Redshift syntax, with the IAM role left as a placeholder):

if key == 'abc/A.csv':
    sql = "COPY table_a FROM 's3://bucket/abc/A.csv' IAM_ROLE '<role-arn>' CSV"
elif key == 'abc/B.csv':
    sql = "COPY table_b FROM 's3://bucket/abc/B.csv' IAM_ROLE '<role-arn>' CSV"
elif key == 'abc/C.csv':
    sql = "COPY table_c FROM 's3://bucket/abc/C.csv' IAM_ROLE '<role-arn>' CSV"

There is no need to store LastModified, since the event is triggered whenever a new file is uploaded. Also, be careful about storing data in a global dict and expecting it to still be there on a future invocation; this will not always be the case. A Lambda container can be removed if it does not run for a period of time, and additional Lambda containers might be created if there is concurrent execution.
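
To make that pitfall concrete, here is a hypothetical sketch of the pattern to avoid:

# Module-level state survives only while this particular container stays warm.
SEEN_KEYS = {}

def lambda_handler(event, context):
    key = event['Records'][0]['s3']['object']['key']
    # Unreliable: a cold start resets SEEN_KEYS, and concurrent invocations
    # run in separate containers, each with its own copy of the dict.
    if key in SEEN_KEYS:
        print('already seen (in this container only)')
    SEEN_KEYS[key] = True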

If you always know that you are expecting 3 files and they are always uploaded in a certain order, then you could instead use the upload of the 3rd file to trigger the process, which would then copy all 3 files to Redshift.
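
A minimal sketch of that ordering-based approach, assuming C.csv always arrives last and reusing the hypothetical run_copy helper from above:

import urllib.parse

COPY_ALL = [
    "COPY table_a FROM 's3://bucket/abc/A.csv' IAM_ROLE '<role-arn>' CSV",
    "COPY table_b FROM 's3://bucket/abc/B.csv' IAM_ROLE '<role-arn>' CSV",
    "COPY table_c FROM 's3://bucket/abc/C.csv' IAM_ROLE '<role-arn>' CSV",
]

def lambda_handler(event, context):
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    if key != 'abc/C.csv':
        return  # A.csv and B.csv arrive earlier; wait for the final file
    for sql in COPY_ALL:
        run_copy(sql)  # the hypothetical Data API helper sketched earlier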
