An AWS Professional Service open source initiative

Pandas on AWS

Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

Quick Start

Installation command: pip install awswrangler

For platforms without PyArrow 3 support (e.g. EMR, Glue PySpark Job, MWAA): pip install pyarrow==2 awswrangler

    import awswrangler as wr
    import pandas as pd
    from datetime import datetime

    df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"]})

    # Storing data on Data Lake
    wr.s3.to_parquet(
        df=df,
        path="s3://bucket/dataset/",
        dataset=True,
        database="my_db",
        table="my_table" […]

Call any existing AWS Lambda function at a future time

aws-lambda-scheduler lets you call any existing AWS Lambda function at a future time. It does this by dynamically managing EventBridge rules. aws-lambda-scheduler also has optimizations you can configure and extend yourself. AWS allows a maximum of 300 EventBridge rules in a region. If you expect to create more than 300 rules, see the Optimizations section below. Example usage: once you set up aws-lambda-scheduler in your AWS environment, you can simply call it with JSON data […]
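The excerpt does not show how aws-lambda-scheduler builds its rules, so the following is only a minimal sketch of one building block such a tool might need: turning a target time into an EventBridge cron expression that matches a single minute. The function name one_time_cron is hypothetical; the six-field layout (minutes, hours, day-of-month, month, day-of-week, year) and UTC evaluation are standard EventBridge rule behavior.

```python
from datetime import datetime, timezone


def one_time_cron(when: datetime) -> str:
    """Build an EventBridge cron expression matching one specific minute.

    Hypothetical helper for illustration; not taken from aws-lambda-scheduler.
    Field order: minutes hours day-of-month month day-of-week year.
    """
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # EventBridge rule schedules are evaluated in UTC.
    utc = when.astimezone(timezone.utc)
    # '?' goes in day-of-week because day-of-month is already fixed.
    return f"cron({utc.minute} {utc.hour} {utc.day} {utc.month} ? {utc.year})"


expr = one_time_cron(datetime(2030, 1, 15, 9, 30, tzinfo=timezone.utc))
print(expr)  # cron(30 9 15 1 ? 2030)
```

Because every pending invocation built this way needs its own rule, the 300-rule limit mentioned above is reached quickly, which is presumably why the project documents optimizations.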

AWS Data Engineering Pipeline with Python

AWS Data Engineering Pipeline. This is a repository for the Duke University Cloud Computing course project on a Serverless Data Engineering Pipeline. For this project, I recreated the pipeline below in Cloud9 (reference: https://github.com/noahgift/awslambda). Below are the steps to build this pipeline in AWS: 1️⃣ Create a new Cloud9 environment dedicated to this project. 🤔 Need a refresher? Please check this repo. ⚠️ Make sure to use name as the unique id for your items in the fang table. […]

A Python SDK for interacting with quantum devices via AWS

Amazon Braket Python SDK. The Amazon Braket Python SDK is an open source library that provides a framework that you can use to interact with quantum computing hardware devices through Amazon Braket. Prerequisites: before you begin working with the Amazon Braket SDK, make sure that you’ve installed or configured the following prerequisites. Python 3.7.2 or greater: download and install Python 3.7.2 or greater from Python.org. Git: install Git from https://git-scm.com/downloads. Installation instructions are provided on the download page. IAM user […]

EKS CDK Quick Start in Python

quickstart-eks-cdk-python: This Quick Start is a reference architecture and implementation of how you can use the Cloud Development Kit (CDK) to orchestrate the Elastic Kubernetes Service (EKS) to quickly deploy a more complete and “production ready” Kubernetes environment on AWS. What this Quick Start creates for you: an appropriate VPC (/22 CIDR with 1024 IPs by default, though you can edit this in cluster-bootstrap/cdk.json) with public and private subnets across three availability zones. Alternatively, just flip create_new_vpc to False […]
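The "/22 CIDR with 1024 IPs" figure can be checked with Python's standard ipaddress module. This is a small arithmetic sketch: the base address 10.0.0.0 is an assumption made for illustration, and the equal split into /25 blocks is just one possible layout, not necessarily how the Quick Start divides its subnets.

```python
import ipaddress

# Hypothetical base address; only the /22 prefix length comes from the text.
vpc = ipaddress.ip_network("10.0.0.0/22")

# A /22 leaves 32 - 22 = 10 host bits, so 2**10 addresses.
print(vpc.num_addresses)  # 1024

# One possible split (an assumption): eight equal /25 blocks of 128 addresses,
# enough for public and private subnets in three availability zones.
blocks = list(vpc.subnets(new_prefix=25))
for block in blocks:
    print(block, block.num_addresses)
```

Note that AWS reserves five addresses in every subnet, so the usable count per subnet is slightly lower than the raw block size.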

Scans Amazon Route53 across an AWS Organization for domain records vulnerable to takeover

domain-protect scans Amazon Route53 across an AWS Organization for domain records vulnerable to takeover: deploy to security audit account; scan your entire AWS Organization; receive alerts by Slack or email; or manually scan from your laptop. Subdomain detection functionality: scans Amazon Route53 Alias records to identify CloudFront distributions with missing S3 origin; scans Amazon Route53 CNAME records to identify CloudFront distributions with missing S3 origin; scans Amazon Route53 for ElasticBeanstalk Alias records vulnerable to takeover […]

A simple tool to check if an IP/hostname belongs to the AWS IP space or not

onaws is a simple tool to check if an IP/hostname belongs to the AWS IP space or not. It uses the AWS IP address ranges data published by AWS to perform the search. Continuous recon of assets; gathering assets using a specific service (e.g. EC2); finding region information for S3 buckets; … etc.

Install: pip install onaws

Usage, given an IP: onaws 52.219.47.34

Given a hostname (a domain or subdomain can be passed as input): onaws example.com

You may […]
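The kind of lookup described above can be sketched with the standard ipaddress module. This is not onaws's own code: the function find_aws_prefixes is hypothetical, and SAMPLE_RANGES is made-up data (using documentation-only address blocks) that merely mimics the "prefixes" list of the ip-ranges.json file AWS publishes.

```python
import ipaddress

# Made-up sample in the shape of AWS's published ip-ranges.json.
# These are documentation-only networks, not real AWS ranges.
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "S3"},
        {"ip_prefix": "198.51.100.0/25", "region": "eu-west-1", "service": "EC2"},
    ]
}


def find_aws_prefixes(ip: str, ranges: dict) -> list:
    """Return every prefix entry whose network contains the given IPv4 address."""
    addr = ipaddress.ip_address(ip)
    return [
        entry
        for entry in ranges["prefixes"]
        if addr in ipaddress.ip_network(entry["ip_prefix"])
    ]


matches = find_aws_prefixes("203.0.113.7", SAMPLE_RANGES)
for entry in matches:
    print(entry["service"], entry["region"], entry["ip_prefix"])

misses = find_aws_prefixes("192.0.2.1", SAMPLE_RANGES)
print(len(misses))
```

A hostname would first have to be resolved to an address (for example with socket.gethostbyname) before the same check applies.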

Create a Neo4J graph of users and roles trust policies within an AWS Organization

AWS_ORG_MAPPER: This tool uses sso-oidc to authenticate to the AWS organization. Once authenticated, the tool will attempt to enumerate all users and roles in the organization and map their trust relations. The graph can be explored using the Neo4j desktop or web client. Below you can find some sample queries that can help extract useful information from the graph. Using this tool, users can discover how role trusts are delegated in the organization, which can help improve account isolation within […]

The elegance of Airflow + the power of AWS

Orkestra: The elegance of Airflow + the power of AWS.

examples/hello_orkestra.py

    import random
    from typing import *
    from uuid import uuid4

    from aws_lambda_powertools import Logger, Tracer
    from pydantic import BaseModel

    from orkestra import compose
    from orkestra.interfaces import Duration


    def dag():
        (
            generate_item
            >> add_price
            >> copy_item
            >> double_price
            >> (do_nothing, assert_false)
            >> say_hello
            >> [random_int, random_float]
            >> say_goodbye
        )


    class Item(BaseModel):
        id: str
        name: str
        price: Optional[float] = None

        @classmethod
        def random(cls):
            return cls(
                id=str(uuid4()),
                name=random.choice(
                    [
                        "potato",
                        "moon rock", […]
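The >> chaining in dag() follows the Airflow-style convention of overloading the right-shift operator to declare "runs before" edges. The sketch below shows how that convention can work in plain Python; the Task class is hypothetical and is not Orkestra's implementation, which the excerpt does not show.

```python
class Task:
    """Hypothetical node type illustrating Airflow-style >> chaining."""

    def __init__(self, name):
        self.name = name
        self.downstream = []

    def __rshift__(self, other):
        # Handles: task >> task, task >> [tasks], task >> (tasks)
        targets = other if isinstance(other, (list, tuple)) else [other]
        self.downstream.extend(targets)
        # Return the right-hand side so the chain can continue from it.
        return other

    def __rrshift__(self, other):
        # Handles: [tasks] >> task. Lists and tuples define no __rshift__,
        # so Python falls back to the right operand's reflected method.
        for task in other:
            task.downstream.append(self)
        return self


a, b, c, d = (Task(n) for n in "abcd")

# Fan out from a to b and c, then fan back in to d.
a >> (b, c) >> d

print([t.name for t in a.downstream])
print([t.name for t in b.downstream])
print([t.name for t in c.downstream])
```

Returning the right-hand operand from each call is what lets a single expression describe a whole graph, including the parallel groups written as tuples or lists.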
