Make sure the required Python libraries are installed on the host (for example via a requirements.txt):

pip install requests

pip install beautifulsoup4


First, import the two packages (BeautifulSoup and Requests):

     from bs4 import BeautifulSoup

     import requests

Second, ask the user to input the URL to scrape:

     url = input('Enter a website to extract the links from:  ')

Third, request the page from the server with an HTTP GET:

     re_data = requests.get(url)

Addition – add an if/else statement to make sure the URL that was entered includes an "http" or "https" scheme. Note that a check like ('https' or 'http') in url does not work as intended: the expression ('https' or 'http') evaluates to just 'https', so the second alternative is never tested. Checking the prefix with startswith handles both schemes correctly:

     if url.startswith(('http://', 'https://')):
         re_data = requests.get(url)
     else:
         re_data = requests.get('https://' + url)
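The same scheme check can be pulled into a small helper so it is easy to test on its own. A minimal sketch (the function name normalize_url is mine, not part of the original script):

```python
def normalize_url(url):
    """Prepend an https:// scheme if the URL does not already have one."""
    if url.startswith(('http://', 'https://')):
        return url
    return 'https://' + url

# normalize_url('example.com')        -> 'https://example.com'
# normalize_url('http://example.com') -> unchanged
```

Passing a tuple to startswith tests every prefix in it, which is the idiomatic way to check several alternatives at once.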

Fourth, use Python's built-in "html.parser" to pull the data out of the HTML:

     soup_parser = BeautifulSoup(re_data.text, 'html.parser')
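To see what BeautifulSoup gives you at this point, it helps to parse a literal HTML snippet first (the snippet below is illustrative, assuming beautifulsoup4 is installed):

```python
from bs4 import BeautifulSoup

sample = '<html><body><a href="https://example.com">Example</a></body></html>'
soup = BeautifulSoup(sample, 'html.parser')

first_link = soup.find('a')    # the first <a> tag, or None if there is none
href = first_link.get('href')  # 'https://example.com'
```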

Fifth, create an empty list to store the links in:

     links = []

Sixth, get all the links by finding every "<a>" tag (find_all), reading each tag's "href" attribute, and appending it to the links list created above. Note that the loop variable must be named link, not links, so it does not shadow the list it is appending to:

     for link in soup_parser.find_all('a'):
         links.append(link.get('href'))
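Not every <a> tag carries an href attribute, so link.get('href') can return None; if you want only real links in the output, filter those out. A small sketch (clean_links is a hypothetical helper, not part of the original script):

```python
def clean_links(raw_hrefs):
    """Drop the None and empty entries that link.get('href') can produce."""
    return [h for h in raw_hrefs if h]

# clean_links(['https://a.example', None, '', '#top'])
# -> ['https://a.example', '#top']
```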

Seventh, print the "links" and save them to a file, passing sep='\n' to the print function so each link is written on its own line:

     with open('Kalistamp.txt', 'a') as done:
         print(*links, sep='\n', file=done)
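Putting the steps above together, the whole tutorial can be sketched as one small script. The names collect_links and main are mine, the timeout value is an assumption, and the network/input part is kept inside main() so the parsing helper can be used and tested on its own:

```python
from bs4 import BeautifulSoup


def collect_links(html_text):
    """Return the href of every <a> tag in the HTML, skipping tags without one."""
    soup = BeautifulSoup(html_text, 'html.parser')
    return [a.get('href') for a in soup.find_all('a') if a.get('href')]


def main():
    # Imported here so collect_links stays usable without requests installed.
    import requests

    url = input('Enter a website to extract the links from:  ')
    if not url.startswith(('http://', 'https://')):
        url = 'https://' + url

    re_data = requests.get(url, timeout=10)  # timeout is an assumed value
    links = collect_links(re_data.text)

    # 'a' appends to Kalistamp.txt on every run, as in the original script.
    with open('Kalistamp.txt', 'a') as done:
        print(*links, sep='\n', file=done)
```

Call main() to run the scraper interactively; collect_links can be exercised directly on any HTML string.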