Real Life Application II

Miscellaneous Applications

Kafka Functionalities

  • Create a topic

  • List all topics

  • Delete a topic

  • Consume messages from a Kafka topic

  • Produce messages to a Kafka topic
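
A minimal sketch of these operations using the kafka-python client. The broker address, topic name, and message contents below are assumptions for illustration, not fixed choices:

```python
import json

from kafka import KafkaConsumer, KafkaProducer
from kafka.admin import KafkaAdminClient, NewTopic

BROKER = "localhost:9092"  # assumed local Kafka broker

# Create a topic
admin = KafkaAdminClient(bootstrap_servers=BROKER)
admin.create_topics([NewTopic(name="books", num_partitions=1, replication_factor=1)])

# List all topics
print(admin.list_topics())

# Produce messages to a Kafka topic (values serialised as JSON)
producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("books", {"title": "The Hobbit", "author": "J. R. R. Tolkien"})
producer.flush()

# Consume messages from a Kafka topic
consumer = KafkaConsumer(
    "books",
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    consumer_timeout_ms=5000,  # stop iterating after 5 s with no new messages
)
for message in consumer:
    print(message.value)

# Delete a topic
admin.delete_topics(["books"])
```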

ElasticSearch (ES) Functionalities

  • Writing data to an ES index (in bulk)

  • Query ES data

  • Update ES data
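
A minimal sketch of these operations with the official elasticsearch Python client (7.x-style calls). The index name, field names, and document id are assumptions for illustration:

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # assumed local ES instance

# Writing data to an ES index in bulk
docs = [
    {"title": "The Hobbit", "author": "J. R. R. Tolkien", "genre": "fantasy"},
    {"title": "Dune", "author": "Frank Herbert", "genre": "science fiction"},
]
helpers.bulk(es, ({"_index": "book_dataset", "_source": doc} for doc in docs))

# Querying ES data: full-text match on a single field
resp = es.search(index="book_dataset", body={"query": {"match": {"author": "tolkien"}}, "size": 10})
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_source"])

# Updating ES data: partial update of one document
# (assumes a document with this _id already exists in the index)
es.update(index="book_dataset", id="1", body={"doc": {"genre": "children's fantasy"}})
```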

Redis Functionalities

  • Different data types in Redis and their use

  • Writing data to Redis

  • Updating Redis data
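
A minimal sketch of the main Redis data types and of writing and updating them with the redis-py client; the key names and values are assumptions for illustration:

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

# String: simple key/value, e.g. caching a single field or a counter
r.set("book:1:title", "The Hobbit")
print(r.get("book:1:title"))

# Hash: one object with several fields
r.hset("book:1", mapping={"author": "J. R. R. Tolkien", "genre": "fantasy"})
print(r.hgetall("book:1"))

# List: ordered items, e.g. recently viewed books
r.lpush("recent:views", "book:1")

# Set: unique, unordered members, e.g. tags
r.sadd("book:1:tags", "fantasy", "classic")

# Sorted set: members ordered by a score, e.g. books ranked by copies sold
r.zadd("books:by_sales", {"book:1": 100_000_000})

# Updating is just writing again: SET overwrites, HSET changes fields in place
r.hset("book:1", "genre", "children's fantasy")
```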

Search Engines: Books and Movies

Dependencies

  • Elasticsearch (data written to ES using Python)

  • Flask REST API (for GET requests)

Objective

  • We will be creating two search engines:

    • Book search engine

    • Movie search engine

  • Input should be:

    • Where to search: books or movies

    • Search fields:

      • For books: author, sold copies, keyword-based search, language, genre, etc.

      • For movies: director, actors, rating, keyword-based search, release year, etc.

    • Number of results (keep the default at 10)

Hint: The same IMDb movie dataset can also be used for a recommendation engine:

  • Input is a movie name; the recommendation system returns the top 10 most similar movies (a minimal sketch follows below)

  • To read more about the recommendation system, here is the link (further data-science-related applications and content are in progress)
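
Purely as an illustration of that hint, a minimal content-based sketch with pandas and scikit-learn could look like the following; the file name movies.csv and the title and genre columns are assumptions about the attached CSV:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Assumption: the movie CSV has "title" and "genre" columns (default integer index)
movies = pd.read_csv("movies.csv")
movies["text"] = movies["title"].astype(str) + " " + movies["genre"].fillna("").astype(str)

# Represent every movie as a TF-IDF vector over its title + genre text
tfidf = TfidfVectorizer(stop_words="english")
matrix = tfidf.fit_transform(movies["text"])

def top_similar(title, n=10):
    # Rank all movies by cosine similarity to the query movie, excluding itself
    idx = movies.index[movies["title"] == title][0]
    scores = cosine_similarity(matrix[idx], matrix).ravel()
    ranked = [i for i in scores.argsort()[::-1] if i != idx]
    return movies.iloc[ranked[:n]]["title"].tolist()

print(top_similar("The Godfather"))
```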

Dataset

Dataset for books: use the JSON file prepared by scraping

Dataset for movies: use the attached CSV file

For a more detailed (more attributes) and larger movie dataset (~85,000 movies), refer to this Kaggle link
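
A minimal sketch of loading both datasets, assuming the file names books.json and movies.csv:

```python
import json

import pandas as pd

# Books: the JSON file produced by the Wikipedia scraper (file name is an assumption)
with open("books.json", "r", encoding="utf-8") as f:
    books = json.load(f)

# Movies: the attached CSV file (file name is an assumption)
movies = pd.read_csv("movies.csv")

print(len(books), "books |", len(movies), "movies")
```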

How to do it?

  • We will be creating two indices:

    • book_dataset

    • movie_dataset

  • Write and test queries in "Dev Tools" (within Kibana)

  • Create a Flask API for communication with the application (see the sketch after this list)
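
One possible shape for the Flask API: a single GET endpoint that takes where to search, the field, the query text, and the number of results. The endpoint path, parameter names, and default field are assumptions; the index names match the two indices above:

```python
from elasticsearch import Elasticsearch
from flask import Flask, jsonify, request

app = Flask(__name__)
es = Elasticsearch("http://localhost:9200")  # assumed local ES instance

# Map the user-facing choice to the two indices created above
INDICES = {"books": "book_dataset", "movies": "movie_dataset"}

@app.route("/search", methods=["GET"])
def search():
    where = request.args.get("where", "books")   # books or movies
    field = request.args.get("field", "title")   # e.g. author, director, genre
    query = request.args.get("q", "")            # search text
    size = int(request.args.get("size", 10))     # number of results, default 10

    body = {"query": {"match": {field: query}}, "size": size}
    resp = es.search(index=INDICES.get(where, "book_dataset"), body=body)
    hits = [hit["_source"] for hit in resp["hits"]["hits"]]
    return jsonify({"count": len(hits), "results": hits})

if __name__ == "__main__":
    app.run(port=5000, debug=True)
```

The same match query body can be tested first in Kibana Dev Tools (e.g. GET movie_dataset/_search with the query above) and then called through the API, for example: curl "http://localhost:5000/search?where=movies&field=director&q=nolan&size=5"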

Github Repository: Basic Search Engine

to be updated....

Creating First Data Pipeline

Data Pipeline Flow

  • Elasticsearch (data written to ES using Python)

  • Kafka reader and ES writer; a data pipeline example (see the sketch after this list):

    • Scrape book data from Wikipedia and store it in a JSON file

    • Move the data from the JSON file to a Kafka topic

    • Read from the Kafka topic and write to the designated ES index

    • Use this ES index for the search engine

Hint: In a real pipeline we would not go through the JSON file at all and would write directly to the Kafka topic; the data is written to a JSON file first here so that the two applications can be built separately, for the sake of understanding.
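
A minimal sketch of steps 2 and 3 of this flow (JSON file to Kafka topic, then Kafka topic to ES index), assuming a local broker and ES instance, the file name books.json, and the topic name books_raw:

```python
import json

from elasticsearch import Elasticsearch, helpers
from kafka import KafkaConsumer, KafkaProducer

BROKER = "localhost:9092"   # assumed local Kafka broker
TOPIC = "books_raw"         # topic name is an assumption

def json_to_kafka(path="books.json"):
    """Move the scraped records from the JSON file into a Kafka topic."""
    producer = KafkaProducer(
        bootstrap_servers=BROKER,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    with open(path, "r", encoding="utf-8") as f:
        for record in json.load(f):
            producer.send(TOPIC, record)
    producer.flush()

def kafka_to_es(index="book_dataset"):
    """Read from the Kafka topic and bulk-write into the designated ES index."""
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        auto_offset_reset="earliest",
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
        consumer_timeout_ms=10000,  # stop once the topic has been drained
    )
    es = Elasticsearch("http://localhost:9200")
    helpers.bulk(es, ({"_index": index, "_source": msg.value} for msg in consumer))

if __name__ == "__main__":
    json_to_kafka()
    kafka_to_es()
```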

Pre-requisites

  • Basic web scraper (see the sketch after this list)

  • Kafka functionalities

  • Kafka to ES writer
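
The "basic web scraper" prerequisite could be as small as the sketch below, using requests and BeautifulSoup. The Wikipedia page and its table layout are assumptions; adjust the selectors to whichever page you actually scrape:

```python
import json

import requests
from bs4 import BeautifulSoup

# Assumed source page; any Wikipedia table of books can be substituted
URL = "https://en.wikipedia.org/wiki/List_of_best-selling_books"

def scrape_books(path="books.json"):
    soup = BeautifulSoup(requests.get(URL, timeout=30).text, "html.parser")
    books = []
    for table in soup.select("table.wikitable"):
        rows = table.select("tr")
        headers = [th.get_text(strip=True) for th in rows[0].select("th")]
        for row in rows[1:]:
            cells = [td.get_text(strip=True) for td in row.select("td")]
            if len(cells) == len(headers):  # skip rows that do not match the header
                books.append(dict(zip(headers, cells)))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(books, f, ensure_ascii=False, indent=2)

if __name__ == "__main__":
    scrape_books()
```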

Application Github Repository Link

Here, we are simply bringing together the different elements and applications we have built in the previous sections.

link to be updated....
