HTML Content Extractor is a simple web application built with Streamlit. It lets you enter a URL, fetch and process the page's HTML from the Google web cache, and download the modified HTML. This is useful when you want to extract and clean HTML content for other purposes. It also lets you read Medium articles without being blocked by the "member-only" paywall.
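The fetch-and-clean flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the cache URL prefix, the function names (`build_cache_url`, `fetch_cached_html`, `strip_scripts`), and the choice to remove only `<script>` elements are all assumptions made for the example. The Google cache endpoint may also be unavailable, in which case the fetch step will fail.

```python
import re
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed Google web cache prefix; the real app may build the URL differently.
CACHE_PREFIX = "https://webcache.googleusercontent.com/search?q=cache:"


def build_cache_url(url: str) -> str:
    """Return the (assumed) Google cache URL for a page."""
    # Percent-encode the whole target URL so it is safe inside the query string.
    return CACHE_PREFIX + quote(url, safe="")


def fetch_cached_html(url: str, timeout: float = 10.0) -> str:
    """Download the cached copy of a page (hypothetical helper, needs network)."""
    request = Request(build_cache_url(url), headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def strip_scripts(html: str) -> str:
    """Remove <script> elements, one possible 'cleaning' step."""
    pattern = r"<script\b[^>]*>.*?</script\s*>"
    return re.sub(pattern, "", html, flags=re.IGNORECASE | re.DOTALL)
```

A caller would chain these as `strip_scripts(fetch_cached_html(url))` and then offer the result for download. A regular expression is enough for a sketch like this; a real HTML parser would be more robust on malformed pages.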
To run this Streamlit app locally, follow these steps:
Clone this GitHub repository to your local machine:
git clone https://github.com/bayhaqy/HTML-Content-Extractor.git
Navigate to the project directory:
cd HTML-Content-Extractor
Install the required Python packages:
pip install -r requirements.txt
Run the Streamlit app:
streamlit run app.py
To run this Streamlit app on Streamlit.io, follow these steps:
This project uses the following Python packages:
You can install these dependencies using pip by running pip install -r requirements.txt.
This project was created by Bayhaqy.
This project is licensed under the MIT License - see the LICENSE file for details.