Technology Stack

Database

Elasticsearch (Search engine, querying, visualizations, recommendation engine, etc.)

Redis (In-memory database, for faster processing)

SQL

Neo4j (Graph database, for relationship graphs)

Queues

Redis, SQS, RabbitMQ

Streaming

Kafka

AWS Cloud

SQS Queue

S3

Lambda Function

Python

FastAPI (Gunicorn, Swagger UI, ReDoc, and other features)

Flask (In production, with concurrency)

Celery (Parallel processing)

Airflow (For scheduling, cron-style jobs)

Selenium driver (For automation and scraping)

Data Science (Pandas, NumPy)

Natural Language Processing (NLTK, spaCy)

Image Manipulation and Processing (OpenCV, NumPy)

Neural Networks (Keras and TensorFlow)

Data Visualisation (Matplotlib, Seaborn)

Deployment

Docker

Auto-scaling (based on request traffic)

Visualization

Kibana

Python (Matplotlib, Seaborn)

Project Management

Git (For version control)

Pivotal Tracker

Big data

Apache Spark (PySpark)
